Abstract

We  examine  how  economic  stratification  affects  inequality  and  growth  over  time.  We  study  economies 
where heterogeneous agents interact through local public goods or externalities (school funding, neighborhood
effects) and economy-wide linkages (complementary skills, knowledge spillovers). We compare
growth  and  welfare  when  families  are  stratified  into  homogeneous  local  communities  and  when  they  remain 
integrated. Segregation tends to minimize the losses from a given amount of heterogeneity, but integration
reduces heterogeneity faster. Society may thus face an intertemporal tradeoff: mixing leads to slower
growth  in  the  short  run,  but  to  higher  output  or  even  productivity  growth  in  the  long  run.  This  tradeoff 
occurs  in  particular  when  comparing  local  and  national  funding  of  education,  which  correspond  to  special 
cases  of  segregation  and  integration.  More  generally,  we  identify  the  key  parameters  which  determine 
which  structure  is  more  efficient  over  short  and  long  horizons.  Particularly  important  are  the  degrees  of 
complementarity  in  local  and  in  global  interactions. 


1      Introduction 

In  the  United  States  a  family's  income,  assets,  education  level,  ethnic  background  and  lifestyle  can  be 
predicted quite accurately from its zip code, and this in spite of the great diversity of the American
population.1  This  strong  degree  of  social  and  economic  segregation,  epitomized  of  late  by  the  spread  of 
gated  communities,  is  reflected  in  wide  disparities  in  the  funding  and  quality  of  local  public  services,  such 
as  primary  and  secondary  education  or  law  enforcement.  It  also  manifests  itself  in  the  increasingly  different 
types  of  behavior  and  values  to  which  the  young  are  exposed  during  their  formative  years. 

The  production  of  goods  and  services  thus  brings  together  on  the  factory  floor  and  at  the  office  workers 
on  the  one  hand,  managers  and  professionals  on  the  other,  whose  upbringing  and  levels  of  human  capital 
are  becoming  increasingly  disparate.  Could  this  polarization  of  educational  opportunities  and  outcomes 
be  a  contributing  factor  not  only  to  the  widening  inequality  in  income  and  wealth  observed  over  the  last 
decade,  but  also  to  the  slowdown  of  productivity  growth?  One  reads  for  instance  in  the  MIT  Commission's 
Report  on  Industrial  Productivity: 

"American  and  foreign  students  differ  not  only  in  their  average  scores  on  standardized  tests 
but  also  in  the  dispersion  of  those  scores  around  the  mean.  The  Japanese  aim  at  bringing  all 
students  to  a  high  common  level  of  competence,  and  they  are  largely  successful;  as  a  result 
... new entrants to the Japanese work force are generally literate, numerate, and prepared to
learn. In the U.S. work force, employers have discovered high rates of illiteracy and difficulty
with basic mathematics and reading in workers with high school diplomas ... Only a tiny fraction
of young Americans are technologically literate and have some knowledge of foreign societies."
(Made in America (1989))

The  U.S.  workforce  is  of  course  much  more  heterogeneous  than  the  Japanese.  But  this  greater  heterogeneity 
may  be  in  good  part  endogenous,  precisely  because  any  exogenous  differences  in  human  capital  (due  to 
historical factors, immigration, or just plain luck) are magnified by economic or social stratification. In
fact, a more diverse population may increase the value of integration, as it accelerates the homogenization
of the labor force. Motivated by these observations, this paper investigates the relationship between the
extent to which an economy is stratified into homogeneous communities, its degree of income inequality,
and its growth performance over time.

1 Weiss (1989) provides a comprehensive and lively description of the "clusters" used in commercial and political marketing.

The  central  question  is  the  following:  given  a  heterogeneous  population,  which  social  structure  is  more 
efficient:  segregation  by  income  and  education,  or  integration?  It  includes  as  a  special  case  the  issue  of 
whether  public  education  should  be  funded  locally  or  nationally.  We  formalize  these  issues  in  a  simple 
but  quite  general  growth  model  with  both  economy-wide  linkages  (such  as  complementarity  in  production 
or  knowledge  spillovers)  and  local,  community-level  externalities  or  public  goods  (such  as  locally  funded 
schools  or  neighborhood  effects).  We  show  that  the  answer  to  the  question  raised  above  depends  on  two 
basic  effects,  which  can  give  rise  to  an  interesting  intertemporal  tradeoff. 

The first effect measures how efficient each social structure is at processing heterogeneity, i.e. at aggregating
disparate levels of human capital into the production of goods and, ultimately, new knowledge. We
show that when family background and community quality are complements in a child's education, a segregated
economy tends to have smaller instantaneous losses, hence faster growth, for any given amount of heterogeneity.
The second effect is dynamic: because an integrated society is better at reducing heterogeneity,
it converges faster to a homogeneous outcome or, in the presence of shocks, converges to a less unequal
distribution  of  skills  and  income.  Integration  thus  delivers  much  of  its  payoff  over  the  course  of  several 
generations.  Is  this  effect  important  enough  for  mixing  to  be  more  efficient  in  the  long  run,  even  when  it 
leads  to  losses  in  the  short  run? 

We  show  that  the  answer  tends  to  be  affirmative,  so  that  integration  at  first  hurts  the  better  off  but 
eventually  raises  everyone's  income.  It  is  then  Pareto  improving,  without  need  for  redistribution,  provided 
agents  have  a  low  enough  discount  rate.  Conversely,  increased  segregation  leads  to:  (i)  a  widening  in  the 
distribution  of  income;  (ii)  a  short-lived  burst  of  growth,  benefiting  mostly  the  better  off  households;  (iii) 
a  decline  in  the  economy's  long-run  level  of  output,  or  even  in  the  long-run  growth  rate.  Remarkably, 
these results remain even as all global complementarities linking together rich and poor families become
vanishingly small.

Of  course,  it  need  not  be  the  case  that  mixing  is  preferable  in  the  long  run.  For  instance,  we  show  that 
stratification  remains  more  efficient  if  the  degree  of  complementarity  between  agents'  levels  of  human  capital 
is  much  stronger  in  local  interactions  than  in  global  interactions.  Intuitively,  this  means  that  disparities  in 
knowledge  at  the  community  level  (e.g.  in  schooling)  entail  sufficiently  greater  losses  than  at  the  aggregate 
level  (e.g.  in  production). 

What  this  paper  offers  is  thus  a  framework  in  which  the  costs  and  benefits  of  different  degrees  of 
stratification  can  be  spelled  out.  It  also  helps  to  clarify  some  issues  of  aggregation  which  arise  when 
externalities  are  combined  with  heterogeneity.  In  models  with  a  representative  agent,  it  is  irrelevant  how 
spillovers  are  specified.  But  as  we  move  to  models  with  heterogeneity,  the  choice  of  aggregator  becomes 
crucial. Thinking about extreme cases, this may seem obvious; naturally the equilibrium will be different if
the spillover is closer to the minimum, to some average, or to the maximum of all agents' actions. But we
show that one cannot even identify (as is common practice) the mean of the logs and the log of the mean
(the geometric and arithmetic averages) without significantly changing the economy's long-run growth
rate, as well as the normative conclusions about the efficiency of stratification and integration. One must
therefore go beyond rough intuitions and pin down the key parameters which determine how heterogeneity
affects growth. The aim is not only to clarify the properties implicitly embodied in the specifications of
previous models, but also to provide a guide for future empirical work on spillovers and neighborhood
or peer effects.2 In addition to the overall degree of returns to scale and the relative weights of family,
community  and  economy-wide  inputs  in  the  production  of  human  capital,  we  bring  to  light  the  crucial 
role  played  by  the  three  elasticities  of  substitution  which  operate  within  local  interactions,  within  global 
interactions,  and  between  all  three  inputs. 

This  paper  draws  on  previous  work  by  Benabou  (1991),  Tamura  (1991a,b)  and  Glomm  and  Ravikumar 
(1992). Benabou (1991) demonstrated that in a general equilibrium context, agents' incentives to segregate
themselves due to the presence of local externalities or public goods can have important effects on aggregate
productivity and welfare, even with perfect capital markets. But the model was static, hence not suited
to study growth or the idea that the merits of integration and segregation may look very different in the
short and in the long run. Being a model with ex-ante identical agents, it could also not address the
issue of inequality, whether due to initial conditions or ongoing shocks; nor could it capture the notion of
homogenization over time.

2 In a sense our effort is similar, in a macroeconomic and dynamic context, to Arnott and Rowse's (1987) study of the
mapping between various forms of peer effects and optimal school structure.

Tamura (1991a) studies endogenous growth when heterogeneous agents are linked by an economy-wide
human  capital  spillover,  and  shows  two  main  results.  First,  because  individual  accumulation  is  subject 
to  decreasing  returns,  heterogeneity  slows  down  growth.  Second,  this  effect  is  only  temporary,  as  in  the 
long  run  the  economy  converges  to  a  homogeneous  outcome.  Tamura  uses  simulations  to  demonstrate  the 
vanishing  impact  of  heterogeneity.  In  this  paper  we  provide  analytical  solutions  for  the  economy's  entire 
dynamic  path,  in  a  general  class  of  models  with  both  local  and  global  spillovers.  This  allows  us  to  raise  the 
issue  of  how  stratification  affects  growth,  and  to  answer  it  by  showing  how  the  losses  per  unit  of  dispersion, 
the  convergence  speed  and  their  interaction  differ  under  segregation  and  integration. 

Glomm  and  Ravikumar  (1992)  compare  growth  under  private  and  public  education.  In  a  private  system 
parents  buy  education  for  their  own  children  and  there  are  no  spillovers.  In  a  public  system  education  is 
a  nationally  funded  public  good,  generating  again  a  global  spillover.  Glomm  and  Ravikumar  show  that 
a  private  system  offers  students  better  incentives  to  invest  in  human  capital,  and  thus  leads  to  a  higher 
long-run  growth  rate.  They  also  provide  an  example  where  heterogeneity  can  cause  a  public  education 
economy  to  grow  faster  for  a  while;  but  they  do  not  analyze  why  this  is  so.  We  make  clear  the  role  played 
by  heterogeneity  in  each  system's  performance.  Most  importantly,  we  show  that  when  children's  ability  or 
adults'  human  capital  is  subject  to  random  shocks,  public  education  may  in  fact  lead  to  faster  long-term 
growth.  If  there  is  even  a  very  small  amount  of  complementarity  in  the  production  sector,  a  move  to  public 
education  can  be  Pareto  improving,  provided  families'  intergenerational  discount  rate  is  low  enough.  In 
any  case,  both  private  and  nationally  funded  education  systems  dominate  locally  funded  public  education, 
which  has  neither  the  incentive  properties  of  the  former  nor  the  homogenization  properties  of  the  latter. 


Finally,  this  paper  is  also  closely  related  to  Durlauf  (1992)  and  S.  Cooper  (1992),  through  a  shared 
concern  about  the  effects  of  stratification  in  a  dynamic,  stochastic  economy.  Durlauf  shows  how  community 
formation  and  local  funding  of  education  can  generate  path-dependence  in  lineage  income,  trapping  some 
families  in  pockets  of  poverty  while  others  enjoy  growth.  S.  Cooper  (1992)  incorporates  redistribution  into 
his  model,  by  allowing  communities  linked  through  production  externalities  to  vote  on  cross-subsidies.  Our 
main  concern  here  is  how  stratification  affects  the  growth  performance  and  efficiency  of  the  whole  economy, 
with  particular  emphasis  on  the  potential  tradeoff  between  the  short  and  the  long  run. 

Section  2  presents  a  model  of  education  and  production  which  motivates  and  sets  up  the  basic  problem. 
It  then  shows  how  similar  issues  arise  in  several  other  models,  and  provides  a  general  framework  in  which 
to study them. Section 3 examines how heterogeneity affects short- and long-run growth in a segregated and
in  an  integrated  economy.  Section  4  shows  how  randomness  in  innate  ability  magnifies  the  long-run  effects 
of  stratification.  Section  5  considers  the  model's  implications  for  the  efficiency  of  nationally  funded  public 
education,  locally  funded  public  education,  and  private  education.  Section  6  concludes.  The  functional 
forms  used  in  the  text  are  generalized  in  Appendix  A;  proofs  are  gathered  in  Appendix  B. 

2     Local  and  Global  Interactions  in  Human  Capital 

2.1     A  First  Model:  Education  and  Production 

We first consider the most natural channels through which group-specific and economy-wide complementarities
arise in the accumulation of human capital: local funding of education, and imperfect substitutability
in production. There is a continuum of overlapping-generations families $i \in \Omega$, of unit measure. During each
period adults work, consume, and spend time rearing their single child. At time zero, the adult member
of dynasty $i$ faces the following problem:


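maximize  $U_0^i = E_0\left[\sum_{t=0}^{\infty} \rho^t\, u(c_t^i)\right]$  subject to:

(1)  $c_t^i = (1 - \tau_t^i)\, y_t^i$

(2)  $y_t^i = \nu_t^i\, w_t^i$

(3)  $h_{t+1}^i = \kappa\, \xi_t^i \left((1 - \nu_t^i)\, h_t^i\right)^{\delta} (E_t^i)^{1-\delta}$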

and $h_0^i$ given. There are no financial assets, only human capital. At time $t$, adult $i$ has human capital $h_t^i$.
She spends a fraction $\nu_t^i$ of her unit time endowment at work, earning the hourly wage $w_t^i$, and devotes
the rest to helping her child learn. The term $h_t^i$ in (3) could also be due in part to inherited ability; the
unpredictable component of the child's innate ability is represented by the i.i.d. shock $\xi_t^i$. The other input
in the production of human capital is a local public good, which is financed by taxing the labor income of
local residents. Per capita expenditures are therefore:

(4)  $E_t^i = \tau_t^i\, Y_t^i = \tau_t^i \int_0^{\infty} y\, dm_t^i(y)$

where $m_t^i$ is the distribution of income, and $Y_t^i$ its average, in the community $\Omega_t^i$ to which family $i$ belongs
at time $t$: city, suburban town, state, etc. The most obvious skill-enhancing public good is primary and
secondary schooling. But law enforcement, libraries, etc., are also relevant.

To simplify the model, we shall take both the fraction of time $\nu_t^i$ spent working and the tax rate $\tau_t^i$ to
be constant over time and independent of community composition. If agents have logarithmic preferences,
their decisions and voting choices will lead, in equilibrium, to such invariant rules.3 But these are really
"ceteris paribus" assumptions by which we abstract from the issues explored in the political economy
literature, in order to focus on some new effects.


3 This is shown in Section 5.2; the values of $\nu$ and $\tau$ are given in Proposition 6. Log-utility also leads to constant investment
and tax rates in Tamura (1991a) and Glomm and Ravikumar (1992) respectively. Voting models of education with variable
tax rates are analyzed by Perotti (1990), Saint-Paul and Verdier (1991) and Fernandez and Rogerson (1992), among others.


We  now  turn  to  the  production  sector.  All  workers  take  part  in  the  production  of  a  numeraire  good, 
performing  complementary  tasks  or  specializing  in  imperfectly  substitutable  intermediate  inputs.  Thus 
total  output  is: 

(5)  $Y_t = \nu \left(\int_0^{\infty} h^{\frac{\sigma-1}{\sigma}}\, d\mu_t(h)\right)^{\frac{\sigma}{\sigma-1}} = \nu \cdot H_t, \qquad \sigma > 1$

where $\mu_t$ denotes the distribution of human capital in the whole labor force $\Omega$.4 This complementarity is
meant to capture the idea that poorly educated, insufficiently skilled production or clerical workers will
drag down the productivity of engineers, managers, doctors, etc. Conversely, lagging advances in knowledge
by scientists, engineers and managers will mean lagging wages for basic workers. Such interdependence
seems quite plausible, especially as we allow $\sigma$ to be arbitrarily large, but finite. Given (5), any worker's
wage and labor income depend positively on the economy-wide level of human capital:

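(6)  $y_t^i = \nu\, w_t^i = \nu\, (H_t)^{\frac{1}{\sigma}} (h_t^i)^{\frac{\sigma-1}{\sigma}}$,

and the same is true for any community's level of per capita income:

(7)  $Y_t^i = \int_0^{\infty} y\, dm_t^i(y) = \nu\, (H_t)^{\frac{1}{\sigma}} \left(\int_0^{\infty} h^{\frac{\sigma-1}{\sigma}}\, d\mu_t^i(h)\right) = \nu\, (H_t)^{\frac{1}{\sigma}} (L_t^i)^{\frac{\sigma-1}{\sigma}}$,

where $\mu_t^i$ is the distribution of human capital in community $\Omega_t^i$. Note that $\sigma > 1$ is required for income
to increase with the level of skills. Incorporating (4) and (7) into (3), the accumulation of human capital
takes the form: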

(8)  $h_{t+1}^i = \Theta\, \xi_t^i\, (h_t^i)^{\alpha} (L_t^i)^{\beta} (H_t)^{\gamma}$

where $\alpha = \delta$, $\beta = (1-\delta)(\sigma-1)/\sigma$, $\gamma = (1-\delta)/\sigma$, with $\alpha + \beta + \gamma = 1$ and $\Theta = \kappa\, (1-\nu)^{\delta} (\nu\tau)^{1-\delta}$.
Equation (8) involves both a local linkage $L_t^i$, because public goods are funded by community income, and a
global linkage $H_t = Y_t/\nu$, because all workers are complementary in production. Both are CES aggregates,
with the same elasticity of substitution. As workers become better substitutes, communities become less
interdependent: $\beta$ rises and $\gamma$ falls.

4 In the appendix we develop a variant of Ethier's (1982) model of specialization which leads to (5) and (6); see the
proof of Proposition 6. Tamura (1991b) offers a model leading to an aggregate production function closely related to (5).
Kremer (1992) and R. Cooper (1992) study equilibrium and optimal task assignment in firms or production teams.
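As a rough numerical check of the algebra leading from (3)-(7) to (8), the following sketch builds a small discrete economy and compares the structural law of motion with the reduced form. All parameter values, the two-community population and the function names are hypothetical choices made for illustration; the shock is set to one.

```python
def ces(values, masses, sigma):
    """Per-capita CES average with elasticity of substitution sigma (sigma != 1)."""
    p = (sigma - 1.0) / sigma
    total = sum(masses)
    return sum(m / total * v ** p for v, m in zip(values, masses)) ** (1.0 / p)


def check_reduced_form(sigma=3.0, delta=0.4, nu=0.7, tau=0.2, kappa=1.5):
    """Compare the structural law of motion (3)-(4) with the reduced form (8)."""
    # Hypothetical economy: two communities, each with two skill levels.
    # Each entry is (skill levels, population masses); masses sum to one.
    communities = [([1.0, 2.0], [0.25, 0.25]), ([4.0, 8.0], [0.25, 0.25])]
    all_h = [h for hs, _ in communities for h in hs]
    all_m = [m for _, ms in communities for m in ms]

    H = ces(all_h, all_m, sigma)  # economy-wide index in (5)

    def income(h):  # labor income, equation (6)
        return nu * H ** (1.0 / sigma) * h ** ((sigma - 1.0) / sigma)

    total_income = sum(m * income(h) for h, m in zip(all_h, all_m))

    # Exponents and constant of the reduced form (8).
    alpha = delta
    beta = (1.0 - delta) * (sigma - 1.0) / sigma
    gamma = (1.0 - delta) / sigma
    theta = kappa * (1.0 - nu) ** delta * (nu * tau) ** (1.0 - delta)

    worst_gap = 0.0
    for hs, ms in communities:
        L = ces(hs, ms, sigma)  # local index
        Y_local = sum(m * income(h) for h, m in zip(hs, ms)) / sum(ms)  # (7)
        E = tau * Y_local  # per capita school spending, equation (4)
        for h in hs:
            structural = kappa * ((1.0 - nu) * h) ** delta * E ** (1.0 - delta)
            reduced = theta * h ** alpha * L ** beta * H ** gamma
            worst_gap = max(worst_gap, abs(structural - reduced))

    return {
        "output_gap": abs(total_income - nu * H),  # wages should add up to (5)
        "exponent_sum": alpha + beta + gamma,
        "worst_gap": worst_gap,
    }


if __name__ == "__main__":
    print(check_reduced_form())
```

The three reported numbers correspond to the adding-up of wages to total output, the constant-returns restriction on the exponents, and the agreement between (3) and (8) for every dynasty.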

This  model  allows  us  to  ask  the  following  questions.  Is  it  more  efficient,  from  the  point  of  view  of 
maximizing  aggregate  output  and  growth,  to  have  the  population  stratify  into  homogeneous  communities 
($L_t^i = h_t^i$) or mix into identical, integrated communities ($L_t^i = H_t$)? Does the answer depend on whether one
takes  a  short-run  or  long-run  perspective?  Can  integration  be  Pareto  improving  even  without  compensating 
transfers  to  richer  families?  These  issues  can  also  be  rephrased  in  a  more  directly  policy-relevant  manner. 
Suppose  that  households  are  in  fact  geographically  segregated  by  education  and  income.  Which  system 
of  public  education  funding  is  then  more  efficient:  local  funding,  where  each  community's  school  budget 
reflects  the  income  of  its  residents,  or  national  funding,  where  expenditures  uniformly  reflect  national 
income?  Before  answering  these  questions  for  the  specific  model  developed  above,  we  show  that  similar 
issues  arise  very  naturally  in  a  wide  class  of  models  from  the  growth  and  human  capital  literatures. 
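To make the two polar community structures concrete, here is a minimal simulation sketch of (8) without shocks, for a finite population of equal-mass dynasties. The parameter values and function names are illustrative assumptions; the sketch is only meant to show the mechanics of each regime, not to settle which one is more efficient.

```python
import math


def ces_mean(h, sigma):
    """Equal-weight CES average with elasticity sigma (sigma != 1)."""
    p = (sigma - 1.0) / sigma
    return (sum(x ** p for x in h) / len(h)) ** (1.0 / p)


def step(h, alpha, beta, gamma, sigma, theta=1.0, integrated=False):
    """One generation of (8) with the shock set to one.

    Segregation: each dynasty's local index is its own level, L = h.
    Integration: every community mirrors the population, L = H.
    """
    H = ces_mean(h, sigma)
    return [theta * x ** alpha * (H if integrated else x) ** beta * H ** gamma
            for x in h]


def log_dispersion(h):
    """Standard deviation of log human capital."""
    logs = [math.log(x) for x in h]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))


def simulate(h0, periods, integrated, **params):
    """Return the list of cross-sections h_0, h_1, ..., h_periods."""
    path = [list(h0)]
    for _ in range(periods):
        path.append(step(path[-1], integrated=integrated, **params))
    return path


if __name__ == "__main__":
    # Hypothetical values consistent with delta = 0.4 and sigma = 3 in (8).
    params = dict(alpha=0.4, beta=0.4, gamma=0.2, sigma=3.0)
    h0 = [1.0, 2.0, 4.0, 8.0]
    for label, flag in (("segregated", False), ("integrated", True)):
        for t, h in enumerate(simulate(h0, 5, flag, **params)):
            print(label, t, round(ces_mean(h, params["sigma"]), 4),
                  round(log_dispersion(h), 4))
```

In this shock-free sketch the dispersion of log human capital shrinks each period by the factor alpha + beta under segregation and by alpha under integration, which is the sense in which integration homogenizes the labor force faster.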

2.2     More  Models  and  a  Puzzle 

There  are  many  potential  channels  which  can  give  rise  to  a  model  with  local  and  economy-wide  interactions 
like  (8).  First,  aggregate  income  clearly  matters  if  either  production  or  education  uses  some  nationally 
provided public good, such as defense or infrastructure. Second, technological spillovers à la Romer (1986)
and Lucas (1988) may affect workers' productivity, leading to individual production functions of the type
$y_t^i = \Theta \cdot (h_t^i)^a (H_t)^b$. As long as the accumulation of knowledge uses produced resources, in schooling or
R&D, it will again be affected by $H_t$. Alternatively, Tamura (1991a) assumes that the aggregate level of
knowledge directly affects the generation of new human capital: $h_{t+1}^i = \Theta \cdot (h_t^i)^a (H_t)^b$.

One would expect some knowledge spillovers to be confined to a smaller sphere than the whole economy,
if only because geographical distance limits frequency of interaction. Indeed, sociologists have long
described, and economists recently modelled, a variety of channels through which a community's human
capital makeup affects the educational outcome of its young people. These sources of "social capital"
(Loury (1977), Coleman (1990)) include: peer effects between students of different ability in the classroom
or within the school (Banerjee and Besley (1991)); the fact that neighboring adults provide role models,
good or bad, as well as networking contacts for the young (Wilson (1987), Streufert (1991) and Montgomery
(1990)); and crime or other activities which interfere with education. There is also a fair amount of
empirical  evidence  of  such  peer  or  neighborhood  effects;  see  Benabou  (1991)  for  a  brief  review.  In  contrast 
to  fiscal  spillovers,  pure  neighborhood  effects  can  generally  not  be  remedied  by  simply  improving  access  to 
capital  markets  or  by  redistributing  income  across  communities.5 

Of  course  any  combination  of  the  channels  discussed  here  and  in  the  preceding  section  is  possible,  even 
likely. But their common feature is the presence of a local aggregate $L_t^i$ and perhaps an economy-wide
aggregate $H_t$ in the production function for new human capital. The question then arises again: is it more
efficient (in the sense of increased output and in the Pareto sense) for society to stratify into homogeneous
clusters, or for each community to reflect the nation-wide distribution of human capital?

Naturally, one expects the answer to depend on the form which interactions take. To show the surprising
extent to which this is true, introduce the idea of local elasticity of substitution, and give some
empirical  flesh  to  the  discussion  of  non-fiscal  spillovers,  let  us  consider  the  following  example.  Borjas 
(1992a)  investigates  whether  human  capital  externalities  operate  within  ethnic  groups.  Using  longitudinal 
data,  he  estimates  the  model: 

(9)  $\log(h_{t+1}^i) = \alpha \log(h_t^i) + \beta \log(L_t^i) + \text{control variables} + \eta_t^i$,

where $h_{t+1}^i$ is a son's level of human capital, measured as his hourly wage; $h_t^i$ is his father's level of human
capital; and $L_t^i$ is "ethnic capital", defined as the geometric average of human capital levels among adults
in the father's ethnic group: $\log(L_t^i) = \int_0^{\infty} \log(h)\, d\mu_t^i(h)$. Finally, $\eta_t^i$ is an unpredictable individual shock.
Borjas estimates both $\alpha$ and $\beta$ to be between 0.25 and 0.30 and statistically significant.6


5  Benabou  (1991)  shows  that  they  can  lead  to  inefficient  self-stratification  even  in  a  representative  agent  model  with  perfect 
capital  markets. 
These results are very interesting in and of themselves, adding to the body of evidence that group
interactions influence the acquisition of skills. But they also raise the following question. From the
(narrow) point of view of maximizing aggregate income, is it more efficient that ethnic groups mix, i.e. live
together, study together, etc., or that they remain separate?7 One might hope to use Borjas' estimates
to answer this question. Unfortunately, we shall see that by using a geometric average, (9) constrains the
answer a priori: for any values of $\alpha$ and $\beta$, the path of total labor income is always higher if each child
is exposed to his own group's average human capital than if all are exposed to the population average. In
the long run the two paths converge to the same level if $\alpha + \beta < 1$.

This conclusion seems rather distressing. But suppose that instead of assuming that ethnic capital
operates through the geometric average, one had used the arithmetic average: $\log(L_t^i) = \log\left(\int_0^{\infty} h\, d\mu_t^i(h)\right)$.
As we show later on, capital accumulation will now be generally more efficient under mixing than under
segregation, especially in the long run. In this instance, the common practice of not distinguishing between
the mean of the logs and the log of the mean can be quite misleading: with heterogeneous agents, Jensen's
inequality implies that both individual and the economy's growth rates depend on the aggregator through
which the externality operates. This point is in fact independent of the stratification issue: if there is
heterogeneity within each ethnic group, an equation like (9) will be misspecified unless the particular
aggregator which it imposes happens to be the correct one.

These remarks demonstrate the importance of the elasticity of substitution among individual inputs into
the production of a peer effect or neighborhood externality; the underlying intuition is developed below.
It will therefore be quite important in empirical work to estimate this parameter, rather than constraining
it to either one or infinity as is usually done.8

6 Borjas also uses years of education instead of log-wages. The remarks made below concerning the consequences of
within-group inequality and inter-group mixing are quite general, and apply to that specification as well.

7 This question is not purely hypothetical; one suspects, and Borjas' (1992b) later work indeed tends to indicate, that
"ethnic capital" really arises from neighborhood effects combined with ethnic segregation.

8 One could specify $L_t^i$ as a CES index and estimate its elasticity $\epsilon$ from a non-linear regression, or more simply include in
(9) the group's variance $(\Delta_t^i)^2$ of log-human capital. Its coefficient will provide an estimate of $-1/\epsilon$; see Section 3.1.
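The role of the aggregator in a rule like (9) can be illustrated with a small sketch that applies one generation of the rule under a geometric and under an arithmetic group average, with groups kept separate or mixed. The group data, the values of alpha and beta, the omission of controls and shocks, and the identification of labor income with human capital are all simplifying assumptions made for this illustration.

```python
import math


def geometric_mean(h):
    """Exponential of the mean of the logs."""
    return math.exp(sum(math.log(x) for x in h) / len(h))


def arithmetic_mean(h):
    """Plain average, whose log is the log of the mean."""
    return sum(h) / len(h)


def next_generation(groups, alpha, beta, aggregator, mixed):
    """Apply h' = h**alpha * L**beta to every dynasty.

    groups: list of lists of parental human capital, one list per group.
    mixed:  if True, every child is exposed to the population-wide index;
            if False, to the index of his own group.
    """
    everyone = [x for g in groups for x in g]
    result = []
    for g in groups:
        L = aggregator(everyone if mixed else g)
        result.append([x ** alpha * L ** beta for x in g])
    return result


def total_income(groups):
    """Sum of human capital, used here as a stand-in for labor income."""
    return sum(x for g in groups for x in g)


if __name__ == "__main__":
    groups = [[1.0, 2.0], [4.0, 8.0]]  # hypothetical two-group population
    for name, agg in (("geometric", geometric_mean),
                      ("arithmetic", arithmetic_mean)):
        separate = total_income(next_generation(groups, 0.3, 0.3, agg, False))
        mixing = total_income(next_generation(groups, 0.3, 0.3, agg, True))
        print(name, "separate:", round(separate, 4), "mixed:", round(mixing, 4))
```

With the geometric aggregator, keeping groups separate can never yield lower total income after one generation than mixing, whatever the positive values of alpha and beta; with the arithmetic aggregator the ranking is not imposed by the functional form.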

2.3     A General Framework

Recognizing  in  the  various  examples  discussed  above  a  common  underlying  structure,  we  shall  consider 
from  here  on  the  general  model  of  knowledge  accumulation: 

(10)  $h_{t+1}^i = F(\xi_t^i, h_t^i, L_t^i, H_t)$

where $\xi_t^i$ is a random shock, $h_t^i$ is parental human capital, and $L_t^i$, $H_t$ are respectively a local and an
economy-wide index of human capital. In general, equation (10) is not a purely technological assumption,
but a reduced form which already embodies a variety of market and non-market interactions: equilibrium
wages, financing of local public goods, technological spillovers, peer effects in schooling, etc. Our earlier
examples incorporated at most one local and one global link at a time, but we discuss below how multiple
externalities can be reduced to (10), where $L_t^i$ and $H_t$ are appropriate composite indices.

The two levels of interaction in (10) open up the possibility of an intertemporal tradeoff. When heterogeneous
families share the same school or community, or when some input into education is equalized,
the rich lose and the poor gain. The net loss, positive or negative, represents the efficiency cost of local
heterogeneity. It must be weighed against the value, positive or negative, of a more homogeneous workforce
in the next generation. Intuitively, heterogeneity causes greater losses at the local and global level,
the less substitutable are individual human capital inputs in $L_t^i$ and $H_t$ respectively. We therefore specify
the external effects as (symmetric) CES averages, with potentially different elasticities of substitution:

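(11)  $L_t^i = \left(\int_0^{\infty} h^{\frac{\epsilon-1}{\epsilon}}\, d\mu_t^i(h)\right)^{\frac{\epsilon}{\epsilon-1}}$

(12)  $H_t = \left(\int_0^{\infty} h^{\frac{\sigma-1}{\sigma}}\, d\mu_t(h)\right)^{\frac{\sigma}{\sigma-1}}$

While $H_t$ is computed over the whole population, $L_t^i$ only reflects the composition of the group $\Omega_t^i$ to which
individual $i$ belongs at time $t$. Depending on the context this can be a school, a community, a region, even
a country.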
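The behaviour of CES averages such as (11) and (12) across the whole range of elasticities can be sketched numerically. The function below is a hypothetical helper for a finite, equal-mass population, parameterized by the inverse elasticity (the degree of complementarity) so that the exponent of the index is one minus that inverse elasticity.

```python
import math


def ces_index(h, inv_elasticity):
    """Equal-weight CES average of the levels in h.

    inv_elasticity plays the role of 1/epsilon in (11) or 1/sigma in (12):
    0 gives the arithmetic mean, 1 the geometric mean, large positive values
    approach the minimum and large negative values approach the maximum.
    """
    p = 1.0 - inv_elasticity  # exponent (epsilon - 1) / epsilon
    if abs(p) < 1e-12:
        # Limit case of unit elasticity: geometric mean.
        return math.exp(sum(math.log(x) for x in h) / len(h))
    return (sum(x ** p for x in h) / len(h)) ** (1.0 / p)


if __name__ == "__main__":
    levels = [1.0, 2.0, 4.0]  # hypothetical human capital levels
    for inv in (-200.0, -1.0, 0.0, 0.5, 1.0, 2.0, 200.0):
        print(inv, ces_index(levels, inv))
```

Because the index is a power mean, it falls monotonically as the degree of complementarity rises, which is why that single parameter can serve as a measure of the cost of heterogeneity.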

We will show that the costs of heterogeneity in $L_t^i$ and $H_t$ are indeed measured by $1/\epsilon$ and $1/\sigma$. As
indicated in Figure 1, we allow them to take any values, even negative ones. When they are negative, agents are
substitutes rather than complements, and inequality is a source of gains.9 As $1/\sigma$ decreases from $+\infty$ to
$-\infty$, $H_t$ spans the whole range from a Leontief technology, $H_t = \min\{h_t^i, i \in \Omega\}$, to a "frontier" technology,
$H_t = \max\{h_t^i, i \in \Omega\}$. The latter case corresponds to the model of Murphy, Shleifer and Vishny (1991),
where the best innovation becomes embodied into the next generation of technology or know-how. Similarly
at the local level, we allow all cases between peer effects of the type "one bad apple spoils the bunch" and role
models where the best individual sets the standard. More generally, the accumulation of human capital
may involve several interactions at each level, say:

(10')  $h_{t+1}^i = F(\xi_t^i, h_t^i; L_{1,t}^i, \ldots, L_{K,t}^i; H_{1,t}, \ldots, H_{N,t})$

For instance, $H_{1,t}$ could arise from complementarity in the production of goods, which makes heterogeneity
costly ($1/\sigma_1 > 0$) and is priced through wages, while $H_{2,t}$ could be associated with the generation of non-rival,
non-excludable new ideas, where inequality is efficient ($1/\sigma_2 < 0$). But all local and global spillovers will
matter only through two weighted averages: we show in the appendix that (10') reduces to (10), where $L_t^i$
and $H_t$ are appropriately defined.

Finally, another important specification is that of $F(\cdot)$, the production
function for new human capital. The literature almost universally assumes the multiplicative form (8)
(with either $\beta$ or $\gamma$ equal to zero), constraining the elasticity of substitution between $h_t^i$, $L_t^i$ and $H_t$ to
equal one. To simplify the exposition, we retain a Cobb-Douglas specification for most of the paper. In
the appendix, we generalize $F(\cdot)$ to a CES aggregator, and show how the effects of heterogeneity and social
structure also depend importantly on the extent to which parental, local and national inputs in a person's
education are substitutes or complements.


9 For a CES index with constant returns such as (12), or just $H(x,y) = \left(\frac{1}{2} x^{\frac{\sigma-1}{\sigma}} + \frac{1}{2} y^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$, $1/\sigma$ simultaneously measures (in the neighborhood of $x = y$) the complementarity between inputs, the concavity of $H$ and the cost of heterogeneity. The last property, which is relevant here, remains true with any return to scale (when the index is raised to some power $\gamma$, as in (8)) whereas the first two do not. Indeed, $H_{12}$ is then proportional to $\gamma - 1 + 1/\sigma$, $H_{11}$ and $H_{22}$ to $\gamma - 1 - 1/\sigma$ and $\det(H'') = H_{11} H_{22} - (H_{12})^2$ to $\gamma^2(1-\gamma)/\sigma$, but $\log(H(\frac{x+y}{2}, \frac{x+y}{2})/H(x,y))$ is simply proportional to $\gamma/\sigma$; see Section 3.1.

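[Figure 1: diagram in the plane of the local and global degrees of complementarity, $1/\epsilon$ and $1/\sigma$. The four regions are labeled "local and global complements", "local complements, global substitutes", "local substitutes, global complements" and "local and global substitutes". The extremes of the global axis correspond to $H = \min\{h^i\}$ and $H = \max\{h^i\}$, those of the local axis to $L = \min\{h^i\}$ and $L = \max\{h^i\}$.]

Figure 1: The costs of heterogeneity: local and global degrees of complementarity $1/\epsilon$ and $1/\sigma$.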

2.4 Community Composition

Given the general model described by (10)-(12), we shall compare the dynamics of human capital accumulation and welfare under two regimes of interest. The first is perfect stratification, where each type of agent lives in a separate, homogeneous community. The second is perfect integration, where each community's composition is the same as that of the population at large. Intermediate cases of partial segregation
could  easily  be  considered  using  the  same  methods;  but  we  do  not  seek  in  this  paper  to  offer  a  theory  of 
endogenous  community  formation.  This  is  for  three  reasons. 

First,  Benabou  (1991)  already  provides  such  a  model,  where  a  differential  sensitivity  to  the  quality  of 
their  environment  leads  high  and  low-skill  workers  to  segregate  as  much  as  technology  and  institutions 
permit. The same basic force is at work here: with $\partial^2 F/\partial h \, \partial L > 0$, parental human capital $h^i_t$ and local public goods or externalities $L^i_t$ are strategic complements in the production of $h^i_{t+1}$. This tends
to  make  more  educated  parents  willing  to  outbid  less  educated  ones  for  land  or  housing  in  a  "better" 
community.  One  could  thus  obtain  once  again  segregation  as  the  only  stable  equilibrium,  sustained  by  land 
rent  differentials.  Alternatively,  one  could  follow  Durlauf  (1992)  and  allow  each  community's  residents  to 
vote  on  zoning  or  minimum  income  requirements.  In  the  absence  of  significant  fixed  costs,  the  rich  have  no 
desire  to  let  in  the  poor,  so  this  would  again  lead  to  stratification.  Implementing  either  approach,  however, 
would require tying oneself to a specific choice of preferences and of the "technology" of segregation: price-elasticity of housing supply in each location, school district boundaries, feasibility of zoning, size of setup
costs  for  a  school  or  a  community,  mobility  costs,  etc. 

The  second  reason  is  that  the  mixing  and  sorting  regimes  correspond  to  alternative  policies:  local  or 
national  funding  of  schools,  tracking  or  busing,  mixed  income  housing,  etc.  The  last  reason  is  that  there 
are  many  sources  of  stratification  which  are  unrelated  to  parents'  concern  for  their  children's  education: 
differences  in  income,  tastes,  racial  segregation,  etc.  We  therefore  choose  to  be  agnostic  about  the  causes 
of  stratification  and  focus  on  the  growth  performances  of  two  "pure"  cases  which  deliver  the  main  insights: 
complete  segregation  and  complete  integration. 

3 Stratification and Growth: the Short and the Long Run

Let us start with the simplest possible case. There are two types of agents, A and B, with measure 1/2 each. They differ only by their initial endowments of human capital: $\Delta = \frac{1}{2} \log(h^A_0/h^B_0) > 0$. Thus $\Delta^2$ is the initial variance of log-human capital.10 We abstract from all uncertainty: $\xi^i_t = 1$.

3.1     Dynamics  and  Losses  from  Heterogeneity 

In  a  stratified  economy,  the  local  environment  compounds  family  differences: 

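(13)  $h^i_{t+1} = \Theta \cdot (h^i_t)^{\alpha+\beta} (H_t)^{\gamma}, \quad i = A, B.$

In an integrated economy, all agents share in the same level of local externality or public good, $L^A_t = L^B_t = L_t$. Denoting all variables in the integrated economy with a hat, we have:

(14)  $\hat{h}^i_{t+1} = \Theta \cdot (\hat{h}^i_t)^{\alpha} (\hat{L}_t)^{\beta} (\hat{H}_t)^{\gamma}, \quad i = A, B.$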

Mixing agents at the local level thus has two effects. First, it decreases the return to scale on parental human capital from $\alpha+\beta$ to $\alpha$, and correspondingly raises the return to scale on the local aggregate from 0 to $\beta$; the effect of $H_t$ remains unchanged. These changes in the effective technology of human capital
accumulation  will  alter  the  impact  of  any  given  amount  of  heterogeneity  on  the  economy's  growth  rate,  in 
a  way  which  we  make  precise  below.  In  other  words,  one  of  the  two  social  structures  will  be  more  efficient 
at  aggregating  heterogeneous  levels  of  knowledge  than  the  other. 

The second effect of mixing is to accelerate convergence to a homogeneous society. Denoting $\Delta_t = \frac{1}{2} \log(h^A_t/h^B_t)$ in the segregated economy and $\hat{\Delta}_t = \frac{1}{2} \log(\hat{h}^A_t/\hat{h}^B_t)$ in the integrated one, we have:

(15)  $\Delta_t = (\alpha+\beta)^t \Delta > \alpha^t \Delta = \hat{\Delta}_t.$

10 As $L^i_t$ depends on group composition but not group size, any proportions $(a, 1-a)$ of A and B families will lead to the same results, with $\Delta = \sqrt{a(1-a)} \cdot \log(h^A_0/h^B_0)$. More generally, the Taylor approximations used below apply to any distribution with small enough dispersion.

Intuition suggests that these two effects may lead to an intertemporal tradeoff: mixing may for instance be less efficient for any given distribution of human capital, but still more efficient in the long run because it reduces heterogeneity faster.11 This issue is investigated below; but first we must determine exactly how heterogeneity affects growth. Consider for instance the segregated economy. The distribution of human capital at time $t$ is fully described by the degree of inequality $\Delta_t$ and any aggregate index of human capital. We focus on the per capita stock of knowledge $A_t = (h^A_t + h^B_t)/2$, which brings out the role played by heterogeneity most clearly.12 From (13) we get:


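$A_{t+1} = \Theta \cdot \left[ \frac{(h^A_t)^{\alpha+\beta} + (h^B_t)^{\alpha+\beta}}{2} \right] \cdot \left[ \left( \frac{(h^A_t)^{\frac{\sigma-1}{\sigma}} + (h^B_t)^{\frac{\sigma-1}{\sigma}}}{2} \right)^{\frac{\sigma}{\sigma-1}} \right]^{\gamma}$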


In a representative agent economy where everyone had the average level of human capital, the right hand side would be $\Theta \cdot A_t^{\alpha+\beta} \cdot A_t^{\gamma}$ and the growth rate $\log(A_{t+1}/A_t) = \theta + (\alpha+\beta+\gamma-1) \log(A_t)$, with $\theta = \log(\Theta)$. When levels of knowledge are unequal, however, the two bracketed terms differ from $A_t^{\alpha+\beta}$ and $A_t^{\gamma}$ due to Jensen's inequality. The differences represent the losses caused by heterogeneity when communities face decreasing returns and agents are complements in the production of the aggregate $H_t$. Conversely, heterogeneity is a source of gain if $\alpha+\beta > 1$ or $1/\sigma < 0$. We shall often focus the exposition on the first case, but it should be kept in mind throughout the paper that the model allows for any configuration of parameters: we do not impose that inequality be bad for growth.


11 In the general model (10) we define greater efficiency as increased total human capital. In specific models, i.e. given a production technology and preferences (e.g. Section 2.1), we shall also consider aggregate output and individual welfare.

12 It will be clear how to go from this arithmetic average to any other aggregate index, such as $H_t$. $A_t$ is a logical choice since it is unaffected by dispersion, and therefore not biased toward segregation or integration: as $\Delta_t > \hat{\Delta}_t$, $A_t = \hat{A}_t$ implies $\hat{H}_t > H_t$ for any $\sigma > 0$, and vice-versa for $\sigma < 0$.
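
We can simplify the expression for the growth rate under heterogeneity by using Taylor approximations. For any $\lambda$ and $x, y$ such that $\Delta = \frac{1}{2} \log(x/y)$ is not too large, the loss function $\Psi_\lambda(\Delta) = \log\left( \left(\frac{x+y}{2}\right)^{\lambda} / \left(\frac{x^{\lambda}+y^{\lambda}}{2}\right) \right)$ can be approximated as $\Psi_\lambda(\Delta) \approx \lambda(1-\lambda)\Delta^2/2$. Thus $\log(H_t/A_t) \approx -\Delta_t^2/2\sigma$ and:

(16)  $\log\left(\frac{A_{t+1}}{A_t}\right) \approx \theta + (\alpha+\beta+\gamma-1)\log(A_t) - \left((\alpha+\beta)(1-\alpha-\beta) + \frac{\gamma}{\sigma}\right) \frac{\Delta^2}{2} (\alpha+\beta)^{2t}$

For the integrated economy, similar derivations lead to:

(17)  $\log\left(\frac{\hat{A}_{t+1}}{\hat{A}_t}\right) \approx \theta + (\alpha+\beta+\gamma-1)\log(\hat{A}_t) - \left(\alpha(1-\alpha) + \frac{\beta}{\epsilon} + \frac{\gamma}{\sigma}\right) \frac{\Delta^2}{2} \alpha^{2t}$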

We see that the drag on each economy's growth is the product of two factors. The first is the economy's efficiency loss per unit of dispersion, to be discussed below. The second is the current variance $\Delta_t^2$ or $\hat{\Delta}_t^2$ of the human capital distribution.
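
The quality of the second-order approximation behind (16) and (17) can be checked numerically. The sketch below compares the exact two-point loss function with its Taylor expansion; the function names and the normalization $x = e^{\Delta}$, $y = e^{-\Delta}$ are choices made for this illustration only.

```python
import math

def loss_exact(lam, delta):
    """Exact loss log(((x+y)/2)^lam / ((x^lam + y^lam)/2)) for two types.

    The levels are normalized to x = e^delta, y = e^-delta, so that
    delta = 0.5 * log(x / y); the loss does not depend on the common scale.
    """
    x, y = math.exp(delta), math.exp(-delta)
    return math.log(((x + y) / 2) ** lam / ((x ** lam + y ** lam) / 2))

def loss_taylor(lam, delta):
    """Second-order approximation lam * (1 - lam) * delta^2 / 2."""
    return lam * (1 - lam) * delta ** 2 / 2

for lam in (0.5, 0.8, 1.2):
    for delta in (0.1, 0.2, 0.5):
        print(lam, delta, loss_exact(lam, delta), loss_taylor(lam, delta))
```

The two columns stay close for small dispersion, and both turn negative when the exponent exceeds one, which is the case where heterogeneity is a source of gain.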

3.2     The  Short  Run 

We first ask which economy grows faster, for any given amount of heterogeneity. In other words, suppose that at time $t = 0$ previously segregated populations become integrated: will human capital at $t = 1$ be higher or lower? From (16), the efficiency loss in the segregated case is $\mathcal{L} \cdot \Delta^2/2$, with:

(18)  $\mathcal{L} = (\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma$

The intuition is clear: losses reflect the concavity of the function $h^{\alpha+\beta}$ and the complementarity $1/\sigma$ of agents' inputs in the aggregate $H$, which has weight $\gamma$. In an integrated economy, the corresponding reduction in growth is $\hat{\mathcal{L}} \cdot \Delta^2/2$, where:

(19)  $\hat{\mathcal{L}} = \alpha(1-\alpha) + \beta/\epsilon + \gamma/\sigma,$
with a similar interpretation involving returns to scale at the family rather than the community level, and both local and global elasticities of substitution.

Proposition 1 The mixed economy has higher growth in the short run, i.e. for any given amount of heterogeneity, if and only if $\mathcal{L} > \hat{\mathcal{L}}$, or:

(20)  $\phi = \beta(1 - 2\alpha - \beta - 1/\epsilon) > 0.$

When $2\alpha+\beta < 1$, the education production function is less concave in the previous generation's human capital under integration than under segregation. Mixing will then accelerate growth even in the short run, provided there is enough local substitutability so that the poor do not drag $L_t$ too far below the per capita endowment $A_t$. Such is clearly the case if the local spillover operates through the arithmetic average ($\epsilon = \infty$), as in Glomm and Ravikumar (1992). On the other hand if $L_t$ is a geometric average ($\epsilon = 1$) as in Borjas (1992a), the mixed economy is more vulnerable to heterogeneity than the segregated one. Finally when $2\alpha+\beta > 1$, mixing tends to reduce human capital accumulation in the short run.13 This case is quite plausible since under constant returns, $\alpha+\beta+\gamma = 1$, it means that parental human capital is more important to a child's education than the economy-wide aggregate $H_t$: $\alpha > \gamma$.
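
The comparison in Proposition 1 can be sketched in a few lines of Python. The function names and the parameter values below are hypothetical; the formulas are those of (18), (19) and (20), with the inverse elasticities passed directly as `inv_eps` and `inv_sigma`.

```python
def loss_segregated(alpha, beta, gamma, inv_sigma):
    """Efficiency loss per unit of dispersion under segregation, as in (18)."""
    return (alpha + beta) * (1 - alpha - beta) + gamma * inv_sigma

def loss_integrated(alpha, beta, gamma, inv_eps, inv_sigma):
    """Efficiency loss per unit of dispersion under integration, as in (19)."""
    return alpha * (1 - alpha) + beta * inv_eps + gamma * inv_sigma

def short_run_gain(alpha, beta, inv_eps):
    """phi in (20): positive when mixing raises short-run growth."""
    return beta * (1 - 2 * alpha - beta - inv_eps)

# Illustrative cases: arithmetic (inv_eps = 0) versus geometric (inv_eps = 1)
# local averages, with 2*alpha + beta below or above one.
for alpha, beta, gamma, inv_eps in [(0.3, 0.3, 0.4, 0.0),
                                    (0.3, 0.3, 0.4, 1.0),
                                    (0.5, 0.3, 0.2, 0.0)]:
    difference = (loss_segregated(alpha, beta, gamma, 0.5)
                  - loss_integrated(alpha, beta, gamma, inv_eps, 0.5))
    print(alpha, beta, inv_eps, difference, short_run_gain(alpha, beta, inv_eps))
```

The difference between the two loss factors coincides with $\phi$ and does not depend on $1/\sigma$, since the global term $\gamma/\sigma$ is common to both regimes.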

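The intuition for Proposition 1 is best understood by showing how $\phi$ embodies the effects at work in standard, static models of matching. By definition, mixing is inefficient if the losses of the rich exceed the gains of the poor, meaning that:

(21)  $(A_1 - \hat{A}_1)/(\Theta H_0^{\gamma}/2) = (h^A_0)^{\alpha} \cdot [(h^A_0)^{\beta} - (L_0)^{\beta}] - (h^B_0)^{\alpha} \cdot [(L_0)^{\beta} - (h^B_0)^{\beta}]$

      $= [(h^A_0)^{\alpha} - (h^B_0)^{\alpha}] \cdot [(h^A_0)^{\beta} - A_0^{\beta}] + (h^B_0)^{\alpha} \cdot [(h^A_0)^{\beta} + (h^B_0)^{\beta} - 2A_0^{\beta}] + [(h^A_0)^{\alpha} + (h^B_0)^{\alpha}] \cdot [A_0^{\beta} - L_0^{\beta}]$

13 The same is true for $\log(\hat{H}_1/H_1) \approx \beta \cdot (\Delta^2/2) \cdot ((1 - 2\alpha - \beta)(1 - 1/\sigma) + 1/\sigma - 1/\epsilon)$, unless $1/\sigma - 1/\epsilon$ is sufficiently large.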
is positive. The first term arises from the complementarity of parental capital and local inputs ($F_{12} > 0$). It is positive since children from better backgrounds lose more from a given decline in $L_0$, such as from $h^A_0$ to the per capita average $A_0$. For small dispersion this term is approximately equal to $2\alpha\beta \cdot A_0^{\alpha+\beta} \cdot \Delta^2/2$. The second term comes from the decreasing impact of marginal improvements in local conditions ($F_{22} < 0$). It is negative since an extra unit of human capital has a larger impact in a poor community than in a rich one. For small dispersion this term is close to $-\beta(1-\beta) \cdot A_0^{\alpha+\beta} \cdot \Delta^2/2$. The final term incorporates the pure losses from heterogeneity in generating the local spillover $L_0$: if $1/\epsilon > 0$, poorly educated agents drag down $L_0$ more than well educated agents pull it up, so $L_0 < A_0$. For small dispersion this last term is approximately equal to $(\beta/\epsilon) \cdot A_0^{\alpha+\beta} \cdot \Delta^2/2$. The first and last effects tend to make sorting more efficient than mixing; the second one goes in the opposite direction. Summing all three yields a net impact of stratification proportional to $2\alpha\beta - \beta(1-\beta) + \beta/\epsilon = -\phi$.

3.3     The  Long  Run 

Because  mixing  equalizes  knowledge  faster,  the  drag  on  growth  due  to  dispersion  eventually  becomes 
smaller  in  the  integrated  than  in  the  segregated  economy,  as  shown  by  the  last  terms  in  (16)-(17).  But 
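what really matters is whether this effect is sufficient for $\hat{A}_t$ to make up its initial handicap and overtake $A_t$. To answer this question, we solve the difference equations (16) and (17). Denoting $R = \alpha+\beta+\gamma$ and $C_t = \theta \cdot (1-R^t)/(1-R)$, we have:

$\log\left(\frac{A_t}{A_0^{R^t}}\right) \approx C_t - \mathcal{L} \cdot \frac{\Delta^2}{2} \sum_{k=0}^{t-1} R^{t-1-k} (\alpha+\beta)^{2k} = C_t - \mathcal{L} \cdot \frac{\Delta^2}{2} \cdot \frac{R^t - (\alpha+\beta)^{2t}}{R - (\alpha+\beta)^2}$

$\log\left(\frac{\hat{A}_t}{A_0^{R^t}}\right) \approx C_t - \hat{\mathcal{L}} \cdot \frac{\Delta^2}{2} \sum_{k=0}^{t-1} R^{t-1-k} \alpha^{2k} = C_t - \hat{\mathcal{L}} \cdot \frac{\Delta^2}{2} \cdot \frac{R^t - \alpha^{2t}}{R - \alpha^2}$

Therefore:

$\log\left(\frac{\hat{A}_t}{A_t}\right) \approx \frac{\Delta^2}{2} \left[ \mathcal{L} \cdot \frac{R^t - (\alpha+\beta)^{2t}}{R - (\alpha+\beta)^2} - \hat{\mathcal{L}} \cdot \frac{R^t - \alpha^{2t}}{R - \alpha^2} \right]$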

Proposition 2 For any $\gamma > 0$, the gap between the integrated and segregated economies is, for $t$ large enough:

(22)  $\log\left(\frac{\hat{A}_t}{A_t}\right) \approx \Phi \cdot \frac{\Delta^2}{2} \cdot (\alpha+\beta+\gamma)^t$

where:

(23)  $\Phi = \frac{(\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma}{\alpha+\beta+\gamma-(\alpha+\beta)^2} - \frac{\alpha(1-\alpha) + \beta/\epsilon + \gamma/\sigma}{\alpha+\beta+\gamma-\alpha^2}$

In the long run, the gap shrinks to zero if total returns to scale $R = \alpha+\beta+\gamma$ are less than one, tends to a finite limit if $R = 1$, and explodes if $R > 1$. Equation (23) embodies the main insights of the paper. The two numerators represent each economy's instantaneous losses per unit of heterogeneity. The two denominators reflect the different speeds of convergence to a homogeneous society. The tradeoff between incurring the costs of local heterogeneity and reducing the losses from aggregate heterogeneity at a faster rate is apparent in the fact that $\Phi$ is decreasing in $\beta/\epsilon$ and increasing in $\gamma/\sigma$. But an additional, less obvious factor is involved: the difference between the concavity of the technologies faced by a community and by a family, adjusted by the appropriate convergence speeds. This factor, which is the value of $\Phi$ for $1/\epsilon = 1/\sigma = 0$, is generally positive, as will be seen below.

We are now ready to answer the question: when is the long-run human capital stock larger under mixing than under segregation? Let us start with a useful benchmark case, assuming: (a) $R = 1$, constant returns; (b) $\epsilon = \sigma$: heterogeneity is equally costly or beneficial at the local and economy-wide levels, a "neutrality" assumption; (c) $1/\sigma < 1$: there is more substitutability within the composite inputs $L_t$ and $H_t$ than between the three inputs $h^i_t$, $L_t$, $H_t$.14 Note that the model of Section 2.1 imposed all three restrictions; in particular, (c) was required for a worker's income to rise with her human capital. Assumption (c) also seems plausible for most alternative interpretations. Parental background, peer group quality and society's general level of knowledge or income are likely to be poorer substitutes in a child's education than workers' different skill levels in the production of output or know-how.

14 When $F(h,L,H)$ is a CES function with elasticity $\lambda$, the relevant comparison is $1/\sigma < 1/\lambda$; see the appendix.

In the benchmark case equation (23) becomes:

(24)  $\Phi = \frac{\beta}{(1+\alpha)(1+\alpha+\beta)} \cdot \left(\frac{\sigma-1}{\sigma}\right) > 0,$

so that integration raises the long-run level of human capital by $\Phi \cdot \Delta^2/2$.15
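
As a cross-check on the algebra, the following sketch evaluates the general coefficient (23) and the benchmark expression (24) side by side. The function names are hypothetical, and the inverse elasticities are passed directly so that the case $\sigma = \infty$ is simply `inv_sigma = 0`.

```python
def long_run_coefficient(alpha, beta, gamma, inv_eps, inv_sigma):
    """Phi in (23): loss factors divided by their convergence terms."""
    R = alpha + beta + gamma
    segregated = (((alpha + beta) * (1 - alpha - beta) + gamma * inv_sigma)
                  / (R - (alpha + beta) ** 2))
    integrated = ((alpha * (1 - alpha) + beta * inv_eps + gamma * inv_sigma)
                  / (R - alpha ** 2))
    return segregated - integrated

def benchmark_coefficient(alpha, beta, inv_sigma):
    """Phi in (24), valid when R = 1 and epsilon = sigma."""
    return beta * (1 - inv_sigma) / ((1 + alpha) * (1 + alpha + beta))

alpha, beta = 0.5, 0.3
gamma = 1 - alpha - beta  # constant returns, R = 1
for inv_sigma in (-1.0, 0.0, 0.5, 1.0):
    print(inv_sigma,
          long_run_coefficient(alpha, beta, gamma, inv_sigma, inv_sigma),
          benchmark_coefficient(alpha, beta, inv_sigma))
```

With geometric averages at both levels (`inv_sigma = 1`) both expressions vanish, so the two regimes share the same steady state.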

Figure 2 illustrates what happens when a previously integrated economy becomes stratified at time $t_0$, given $\phi < 0 < \Phi$ and $R = 1$. The common trend $\theta t$ is factored out from all variables, making them stationary. Initially, the richer A agents benefit, while the poorer B's lose. The distribution of income worsens, but the overall impact is favorable and growth accelerates. Over time, however, it slows down due to the fact that society remains more heterogeneous. Eventually, even the A's accumulation is reduced, and all dynasties' capital stocks converge to a common level which is lower than what it would have been had society remained integrated. To get a feel for orders of magnitude, let $\alpha = .5$, $\beta = .3$, $\gamma = .2$ and $\epsilon = \sigma = \infty$. Let $\Delta = 1$, or $h^A_0/h^B_0 = 7.4$; this corresponds to a coefficient of variation of 0.76. The secession of the rich at first raises growth by 4.5%, but eventually lowers the steady-state path of the economy by 5.6%. The initial boom is erased two generations later. Integrating a previously segregated society leads to the converse scenario, with an initial growth slowdown but higher steady-state output.
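
The orders of magnitude quoted above can be reproduced by iterating the approximate growth equations (16) and (17). The sketch below tracks the log gap between the integrated and segregated stocks of human capital; the function name `gap_path` and the choice of 200 periods are assumptions of this illustration, and the recursion relies on the same small-dispersion approximation as the text.

```python
def gap_path(alpha, beta, gamma, inv_eps, inv_sigma, delta, periods):
    """Iterate (17) minus (16): gap[t] approximates log(A_hat_t / A_t)."""
    R = alpha + beta + gamma
    loss_seg = (alpha + beta) * (1 - alpha - beta) + gamma * inv_sigma
    loss_int = alpha * (1 - alpha) + beta * inv_eps + gamma * inv_sigma
    gap = [0.0]  # both economies start from the same distribution
    for t in range(periods):
        drag_seg = loss_seg * (alpha + beta) ** (2 * t) * delta ** 2 / 2
        drag_int = loss_int * alpha ** (2 * t) * delta ** 2 / 2
        gap.append(R * gap[-1] + drag_seg - drag_int)
    return gap

# Parameters of the numerical example: arithmetic averages at both levels.
path = gap_path(alpha=0.5, beta=0.3, gamma=0.2,
                inv_eps=0.0, inv_sigma=0.0, delta=1.0, periods=200)
for t in (1, 2, 3, 4, 200):
    print(t, round(path[t], 4))
```

A negative entry means that the stratified economy is ahead. Under these assumptions the gap starts at about -4.5%, is essentially closed by the third period, and converges to roughly +5.6% in favor of integration.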

Figure 2: The short and long run effects of stratification

Of course, it need not be the case that mixing is preferable in the long run. We now examine more generally the relative performance of the two social structures, by varying key parameters. First, note that the steady-state gap (24) remains positive even when the degree of economy-wide linkage $\gamma$ tends to zero. The impact of stratification on the economy's long-run performance is thus surprisingly different when rich and poor are completely independent from one another and when their fates are linked to an arbitrarily small extent.16 We shall come back to this feature when evaluating each dynasty's welfare later on. Second, we show in the appendix that decreasing total returns $R < 1$ raise the benefits of integration; the converse is true for $R > 1$. Third, and most importantly, segregation remains preferable in the long run if disparities in knowledge entail sufficiently greater losses at the community level, e.g. in schooling, than at the aggregate level, e.g. in production. Rearranging (23), with $R = 1$:

(25)  $\Phi \gtrless 0 \quad \text{as} \quad \frac{1}{\epsilon} - \frac{1}{\sigma} \lessgtr \left(1 - \frac{1}{\sigma}\right) \left(\frac{1-\alpha}{1+\alpha+\beta}\right)$

15 Note that (24) allows for inequality to be beneficial, i.e. for $1/\sigma = 1/\epsilon < 0$. When all externalities operate through geometric averages (Tamura (1991a), Borjas (1992a)), (24) shows that mixing and segregation lead to the same steady-state.

16 On the other hand, as $\alpha+\beta = 1-\gamma$ approaches one, the speed at which the segregated economy tends toward its lower asymptote approaches zero; see the expression for $\log(A_t/A_0^{R^t})$ in the appendix.


The two regions, illustrated on Figure 3, are quite intuitive. Perhaps most noteworthy is the triangular area between the boundary and the two axes; since $1/\sigma < 0 < 1/\epsilon$, heterogeneity creates negative spillovers at the local level but positive ones at the aggregate level. Nonetheless, mixed communities lead to a superior long-run outcome due to the differential combination of returns to scale in accumulation and convergence speeds discussed earlier. In order to overcome this effect and make $\Phi < 0$, $1/\epsilon$ must be sufficiently larger than $1/\sigma$. For instance, it will be efficient for the managerial and working classes to live separately if it
is  much  easier  for  good  managers  to  make  up  for  poorly  qualified  workers  in  the  production  of  output, 
than  for  students  from  favorable  backgrounds  to  offset  the  effect  of  weaker  schoolmates  in  peer  interactions. 
Similarly,  it  will  be  optimal  to  sort  successive  generations  of  Ph.D.  students  into  departments  of  differential 
qualities  when  complementarities  are  stronger  during  graduate  studies  than  they  are  during  research  careers. 

3.4     Welfare  and  Pareto  Optimality 

It is easy to go from the aggregates $A_t$ or $\hat{A}_t$ to each group's path of human capital. For instance, under segregation: $\log(h^A_t) = \log(A_t) + \log(2/(1+e^{-2\Delta_t})) \approx \log(A_t) + \Delta_t - \Delta_t^2/2$. Given a specification of preferences and of the relationship between skills $h^i_t$ and income $y^i_t$, for instance as in Section 2.1, we could compute present values of each family's utility or of any planner's social welfare function. But the main insights are clear even without doing so. First, if integration is more efficient even in the short run ($0 < \phi < \Phi$), it will lead to a higher value in each period, not only of total human capital, but also of output and of any social welfare function which is a CES aggregate of individual human capital or consumption levels. In the more interesting case where $\phi < 0 < \Phi$, social welfare will be higher under mixing if the planner has a low enough discount rate. Moreover, if individual agents' discount factor $\rho$ is high enough, integration will be Pareto improving even without any redistribution, because in the long run all agents are identical and share the same level of human capital $A_\infty$ or $\hat{A}_\infty$.

4     Random  Ability,  Stratification  and  Long  Run  Growth 

The  previous  model  allowed  us  to  draw  the  essential  distinction  between  the  short  and  the  long  run  effects  of 
stratification,  and  to  identify  the  main  parameters  which  determine  how  heterogeneity  affects  growth.  But 
it had two drawbacks. First, the long-run distribution of income was always degenerate (unless $\alpha+\beta > 1$, but then $R > 1$ implies explosive growth). This is clearly contrary to the evidence. Second, the way in
which  the  economy  was  stratified  had  no  effect  on  the  long  run  growth  rate,  except  again  in  the  case  of 
increasing  returns;  see  (22). 

In  this  section  we  solve  both  problems  by  incorporating  random  shocks  to  children's  ability  or  uncertain 
returns  to  education,  as  in  Loury  (1981).  Such  random  draws  of  luck  constitute  a  permanent  source  of 
inequality,  but  also  of  social  mobility:  the  relative  rankings  of  any  two  dynasties  will  no  longer  be  preserved 
forever,  but  will  change  with  positive  probability.  Because  a  mixed  society  "undoes"  inequality  faster,  it 
will  have  a  less  dispersed  asymptotic  distribution  of  human  capital  and  income  than  a  segregated  one. 
On  the  other  hand,  it  may  still  be  less  efficient  at  processing  any  given  amount  of  heterogeneity.  Under 
constant  returns,  the  balance  of  these  two  effects  will  determine  which  of  the  integrated  or  segregated 
economies  has  the  higher  long-run  growth  rate. 

4.1      Dynamics 

Let the accumulation of human capital be given by (8), where $L^i_t$ and $H_t$ are defined as in (11)-(12) and the shocks $\xi^i_t$ are i.i.d. with $\log(\xi^i_t) \sim \mathcal{N}(0, s^2)$. Assuming independent shocks involves little loss of generality, since intergenerational correlation of ability is already captured by the $h^i_t$ term. We also take the initial distribution of human capital to be log-normal: $\log(h^i_0) \sim \mathcal{N}(m, \Delta^2)$. The advantage of this specification, which builds on Glomm and Ravikumar's (1992) deterministic model, is that $h^i_t$ remains log-normally distributed. This allows CES aggregates and loss functions to be computed exactly: if $\log(h^i_t) \sim \mathcal{N}(m_t, \Delta_t^2)$, then: $\Psi_\lambda(\Delta_t) = \log\left( \left(\int_0^\infty h \, d\mu_t(h)\right)^{\lambda} / \int_0^\infty h^{\lambda} \, d\mu_t(h) \right) = \lambda(1-\lambda) \cdot \Delta_t^2/2$. Setting $\lambda = (\sigma-1)/\sigma$ yields:


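(26)  $H_t = \left( \int_0^\infty h^{\frac{\sigma-1}{\sigma}} \, d\mu_t(h) \right)^{\frac{\sigma}{\sigma-1}} = \exp\left( m_t + \left(\frac{\sigma-1}{\sigma}\right) \frac{\Delta_t^2}{2} \right) = A_t \cdot \exp\left( -\frac{\Delta_t^2}{2\sigma} \right)$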


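Now consider human capital at time $t+1$. Taking logarithms in (8) with $L^i_t = h^i_t$ (segregation):

(27)  $\log(h^i_{t+1}) = \theta + \log(\xi^i_t) + (\alpha+\beta) \log(h^i_t) + \gamma \left( m_t + \frac{\sigma-1}{\sigma} \cdot \frac{\Delta_t^2}{2} \right)$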


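Human capital at time $t+1$ is therefore also log-normally distributed: $\log(h^i_{t+1}) \sim \mathcal{N}(m_{t+1}, \Delta^2_{t+1})$, with

(28)  $m_{t+1} = \theta + R \cdot m_t + \gamma \left(\frac{\sigma-1}{\sigma}\right) \frac{\Delta_t^2}{2}$
      $\Delta^2_{t+1} = (\alpha+\beta)^2 \Delta_t^2 + s^2$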


Integration yields similar expressions, with $\alpha+\beta$ replaced by $\alpha$ and $\beta(\epsilon-1)/\epsilon$ added to $\gamma(\sigma-1)/\sigma$. The steady-state variance of human capital is then $\hat{\Delta}^2_\infty = s^2/(1-\alpha^2)$, which is lower than $\Delta^2_\infty = s^2/(1-(\alpha+\beta)^2)$, as expected. We could solve (28) for the mean of log-human capital $m_t$ at any point in time. But in order to make the losses from heterogeneity appear most clearly, it is better to track once again the behavior of total human capital $A_t = \int_0^\infty h \, d\mu_t(h)$. Using (26) to (28), we obtain the growth rates of the two economies:


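(29)  $\log\left(\frac{A_{t+1}}{A_t}\right) = \theta + \frac{s^2}{2} + (R-1)\log(A_t) - \frac{\Delta_t^2}{2} \cdot \left( (\alpha+\beta)(1-\alpha-\beta) + \frac{\gamma}{\sigma} \right)$

(30)  $\log\left(\frac{\hat{A}_{t+1}}{\hat{A}_t}\right) = \theta + \frac{s^2}{2} + (R-1)\log(\hat{A}_t) - \frac{\hat{\Delta}_t^2}{2} \cdot \left( \alpha(1-\alpha) + \frac{\beta}{\epsilon} + \frac{\gamma}{\sigma} \right)$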


These expressions are identical to (16) and (17), up to a constant. In particular, we recognize the loss factors $\mathcal{L} = (\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma$ and $\hat{\mathcal{L}} = \alpha(1-\alpha) + \beta/\epsilon + \gamma/\sigma$ for each economy, multiplied by their respective variances. The comparison between the two social structures' efficiency at aggregating levels of knowledge, i.e. between their capital stocks one period after starting from the same initial conditions, is therefore unchanged: $\log(\hat{A}_1/A_1) = \beta(1 - 2\alpha - \beta - 1/\epsilon) \cdot \Delta^2/2 = \phi \cdot \Delta^2/2$.

To examine the impact of stratification on long-term output and growth, we solve (28)-(30) for $(A_t, \Delta_t)$ and $(\hat{A}_t, \hat{\Delta}_t)$; see the appendix. We first consider the case where initial endowments are the only source of inequality.

Proposition 3: The effect of initial inequality. If $s^2 = 0$ then for any $\gamma > 0$ and $t$ large enough:

$\log\left(\frac{\hat{A}_t}{A_t}\right) \approx \Phi \cdot \frac{\Delta^2}{2} \cdot R^t$


This  is  exactly  the  same  expression  as  in  the  two-group  case,  so  all  the  results  derived  previously  extend 
to  this  model. 

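Proposition 4: The effect of ongoing inequality. If $s^2 > 0$, then for large enough $t$:

$\log\left(\frac{\hat{A}_t}{A_t}\right) \approx \frac{s^2}{2} \cdot \left(\frac{1-R^t}{1-R}\right) \cdot \left( \frac{\mathcal{L}}{1-(\alpha+\beta)^2} - \frac{\hat{\mathcal{L}}}{1-\alpha^2} \right)$

Under constant returns, the integrated economy's long run growth rate exceeds that of the segregated economy by $\Phi \cdot s^2/2$.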

As seen earlier, $\Phi > 0$ unless the cost of local heterogeneity exceeds that of aggregate heterogeneity by a sufficient margin. Proposition 4 shows, quite intuitively, that recurrent inequality impacts the two economies one level higher than initial dispersion does. When $R = 1$, $s^2$ affects long-run growth rates in the same way as $\Delta^2$ affects long-run levels. When $R < 1$, $s^2$ impacts long-run levels whereas the effect of $\Delta^2$ vanishes asymptotically.
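
The constant-returns statement in Proposition 4 can be illustrated by iterating the exact log-normal recursions. The sketch below applies (28) and its integrated counterpart, then compares the long-run growth rates of total human capital. The function name `log_stock_path`, the zero initial mean and all parameter values are hypothetical choices for this example.

```python
def log_stock_path(alpha, beta, gamma, inv_eps, inv_sigma,
                   s2, delta2, theta, periods, integrated):
    """Iterate (28), or its integrated analogue, and return log(A_t) and the final variance.

    With log h ~ N(m, var), total human capital satisfies log(A) = m + var / 2.
    """
    own = alpha if integrated else alpha + beta    # persistence of own capital
    spill = gamma * (1 - inv_sigma)                # weight on var/2 from H_t
    if integrated:
        spill += beta * (1 - inv_eps)              # extra term from mixed L_t
    R = alpha + beta + gamma
    m, var = 0.0, delta2
    log_A = [m + var / 2]
    for _ in range(periods):
        m = theta + R * m + spill * var / 2        # uses the current variance
        var = own ** 2 * var + s2
        log_A.append(m + var / 2)
    return log_A, var

params = dict(alpha=0.5, beta=0.3, gamma=0.2, inv_eps=0.5, inv_sigma=0.5,
              s2=0.25, delta2=1.0, theta=0.02, periods=300)
seg, var_seg = log_stock_path(integrated=False, **params)
mix, var_mix = log_stock_path(integrated=True, **params)
growth_gap = (mix[-1] - mix[-2]) - (seg[-1] - seg[-2])
benchmark_phi = 0.3 * (1 - 0.5) / ((1 + 0.5) * (1 + 0.5 + 0.3))  # (24)
print(growth_gap, benchmark_phi * params["s2"] / 2)
```

With these values the two printed numbers agree, and the integrated economy settles at the lower steady-state variance $s^2/(1-\alpha^2)$.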

4.2 Discounting and Welfare

Let us now examine the welfare of individual families, asking in particular whether integration can bring about a Pareto improvement without redistribution. To derive simple, closed form expressions, we assume that agents have logarithmic utility and compute the expected present value of log-human capital for each family. Note that if labor income depends not only on own human capital but also on aggregate productivity, as in (6), this will understate the relative benefit of integration with respect to segregation.17 Finally, we use the benchmark specification $1/\epsilon = 1/\sigma < 1$ and $R = 1$. It should be clear from Section 3.3 in which direction deviations from this case will pull the results.

Straightforward but tedious calculations yield family $i$'s distribution of human capital at any time $t$, and ultimately its expected intertemporal welfare, conditional on its initial endowment. Taking differences between the mixed and stratified cases yields:


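(31)  $\hat{U}^i_0 - U^i_0 = E_0\left[ \sum_{t=0}^{\infty} \rho^t \log(\hat{h}^i_t/h^i_t) \,\Big|\, h^i_0 \right]$

      $= \frac{\rho\beta \, (m - \log(h^i_0))}{(1-\rho\alpha)(1-\rho(\alpha+\beta))} + \left(\frac{\rho}{1-\rho}\right) \cdot \left(\frac{\sigma-1}{\sigma}\right) \cdot \left(\frac{\rho s^2 + (1-\rho)\Delta^2}{2(1-\rho)}\right) \cdot \left( \frac{1-\alpha}{1-\rho\alpha^2} - \frac{\gamma}{1-\rho(\alpha+\beta)^2} \right)$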

The first term reflects the impact of the initial endowment; ceteris paribus, dynasties which start above the mean lose from integration, while those which start below the mean gain. The second term is always positive, reflecting the fact that ceteris do not remain paribus: integration raises the level (through $\Delta^2$) and possibly the growth rate (through $s^2$) of the unconditional average of (log) human capital. Since this is the expectation of the asymptotic distribution facing each dynasty, it makes all better off to an extent which reflects their degree of patience or intergenerational altruism. As $\rho$ tends to one, so does the fraction of dynasties made better off by integration. Since in practice the distribution of human wealth at any point in time has finite support, integration is for all practical purposes Pareto improving if agents are sufficiently patient. Note that this has nothing to do with any insurance effect: in (31), dynasty $i$'s own shocks $\log(\xi^i_t)$ are set to their expected value of zero under both integration and segregation.

17 Assuming $1/\sigma > 0$; the bias is reversed when $1/\sigma < 0$.

Let us next examine the role played in this result by the global externality $H_t$, which ties together the accumulation paths of rich and poor (due for instance to complementarity in production, as in Section 2.1). Suppose that this link becomes very weak: $\gamma$ tends to zero and, maintaining $R = 1$, $\alpha + \beta$ tends to 1. Recall from (24) that the asymptotic difference in growth rates $\Phi \cdot s^2/2$ or in levels $\Phi \cdot \Delta^2/2$ between the two economies remains positive in the limit; on the other hand, the speed $(\alpha+\beta)^t$ at which the stratified economy converges to its inferior trajectory slows down to zero. Equation (31) incorporates these two opposing effects into discounted present values; when $\gamma$ goes to zero, it becomes:

(32)  $$\hat U^i_0 - U^i_0 \;\to\; \frac{\rho(1-\alpha)}{1-\rho}\left[\frac{m - \log(h^i_0)}{1-\rho\alpha} + \left(\frac{\sigma-1}{\sigma}\right)\left(\frac{1}{1-\rho\alpha^2}\right)\left(\frac{\rho s^2 + (1-\rho)\Delta^2}{2(1-\rho)}\right)\right]$$


In the limit, integration's long run effect on the unconditional mean of (log) human capital still contributes positively to each dynasty's net welfare. When initial endowments are the only source of inequality, $s^2 = 0$, this is only a level effect, so only a bounded fraction of the population has a positive net gain $\hat U^i_0 - U^i_0$, for any value of the discount factor. Integration is not Pareto improving unless richer families receive compensating transfers. When $s^2 > 0$, however, mixing has a growth rate effect. This dominates any level effect from initial conditions, provided the discount factor is high enough. Thus we see that once again the fraction of net gainers becomes arbitrarily close to one as $\rho$ tends to one. If agents are patient enough, integration makes almost all families better off, even with an arbitrarily small degree of complementarity between rich and poor.18
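The role of patience can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it reads (31) as the sum of an endowment term $\rho\beta(m - \log h^i_0)/[(1-\rho\alpha)(1-\rho(\alpha+\beta))]$ and a common term proportional to $\rho s^2 + (1-\rho)\Delta^2$, takes $R = 1$ and $\varepsilon = \sigma$, and assumes initial log human capital is normal with mean $m$ and variance $\Delta^2$. The function names and parameter values are hypothetical.

```python
import math

def welfare_gap(log_h0, m, Delta2, s2, rho, alpha, beta, sigma):
    """Discounted gain from integration for a dynasty with initial log capital log_h0,
    under the assumed reading of (31) with R = 1 (so gamma = 1 - alpha - beta)."""
    gamma = 1.0 - alpha - beta
    endowment = rho * beta * (m - log_h0) / ((1 - rho * alpha) * (1 - rho * (alpha + beta)))
    common = (
        (rho / (1 - rho))
        * ((sigma - 1) / sigma)
        * (rho * s2 + (1 - rho) * Delta2) / (2 * (1 - rho))
        * ((1 - alpha) / (1 - rho * alpha**2) - gamma / (1 - rho * (alpha + beta) ** 2))
    )
    return endowment + common

def share_of_gainers(m, Delta2, s2, rho, alpha, beta, sigma):
    """Fraction of dynasties with a positive gap when log h0 ~ N(m, Delta2)."""
    common = welfare_gap(m, m, Delta2, s2, rho, alpha, beta, sigma)  # endowment term vanishes at the mean
    slope = rho * beta / ((1 - rho * alpha) * (1 - rho * (alpha + beta)))
    z = common / (slope * math.sqrt(Delta2))  # gainers satisfy (log h0 - m) / Delta < z
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical parameter values, chosen only for illustration.
params = dict(m=0.0, Delta2=1.0, s2=0.1, alpha=0.5, beta=0.25, sigma=2.0)
shares = [share_of_gainers(rho=r, **params) for r in (0.5, 0.9, 0.99)]
```

With these illustrative values the share of gainers rises with $\rho$ and is numerically indistinguishable from one at $\rho = 0.99$, in line with the statement that integration is almost Pareto improving for patient agents.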


18 The formal statement corresponding to (31) is: $\forall h_0$, $\forall \gamma > 0$, $\exists \rho(h_0, \gamma)$ such that if $\rho > \rho(h_0, \gamma)$ all dynasties starting with $h^i_0 < h_0$ are better off under integration. The stronger statement corresponding to (32) is: $\forall h_0$, $\exists \rho^*(h_0)$ such that: $\forall \gamma > 0$, if $\rho > \rho^*(h_0)$ then all dynasties starting with $h^i_0 < h_0$ are better off under integration.

5     Applications of the General Model

5.1     Local  versus  National  Funding  of  Public  Education 

What do the results of the general model imply for the relative efficiency of locally and nationally funded public schooling? Recall from the model of Section 2.1 that if $\delta$ and $1 - \delta$ are the weights of parental education and school expenditures in a child's human capital, local funding ($E^i_t = \tau \cdot Y^i_t$) is equivalent to income segregation in the general model (8), and national funding ($E^i_t = \tau \cdot Y_t$) equivalent to integration, with: $\varepsilon = \sigma > 1$, $\alpha = \delta$, $\beta = (1-\delta)(\sigma-1)/\sigma$, $\gamma = (1-\delta)/\sigma$ and $R = 1$. Therefore, by (20) and (24):

Proposition 5 (1) National funding of education leads to slower human capital accumulation than local funding in the short run, since:

$$\phi = -\delta(1-\delta)\left(1 - \frac{1}{\sigma^2}\right) < 0.$$

The same is true for output growth if $\delta(1+\sigma) > 1$.19

(2) National funding, however, is more efficient in the long run, since:

$$\Phi = \left(\frac{\sigma}{2\sigma+\delta-1}\right)\left(\frac{1-\delta}{1+\delta}\right)\left(\frac{\sigma-1}{\sigma}\right)^2 > 0.$$

When initial endowments are the only source of inequality, national funding raises the long run levels of human capital $A_t$ and output $Y_t$ by $\Phi \cdot \Delta^2/2$. When children's ability or returns to education are uncertain, it raises the long run growth rate of human capital and output by $\Phi \cdot s^2/2$.
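As a quick numerical check of Proposition 5, the sketch below compares its two closed forms with the loss-factor expressions for $R = 1$. It assumes that the loss factors take the form $C = (\alpha+\beta)(1-\alpha-\beta) + \gamma/\sigma$ under segregation and $\hat C = \alpha(1-\alpha) + \beta/\varepsilon + \gamma/\sigma$ under integration, with $\phi = C - \hat C$ and $\Phi = C/(1-(\alpha+\beta)^2) - \hat C/(1-\alpha^2)$; these are the $\lambda = 1$, $R = 1$ case of (A.2)-(A.3). Function names and parameter values are hypothetical.

```python
import math

def short_run_phi(delta, sigma):
    """Closed form of Proposition 5, part (1)."""
    return -delta * (1 - delta) * (1 - 1 / sigma**2)

def long_run_Phi(delta, sigma):
    """Closed form of Proposition 5, part (2)."""
    return (sigma / (2 * sigma + delta - 1)) * ((1 - delta) / (1 + delta)) * ((sigma - 1) / sigma) ** 2

def general_phi_Phi(alpha, beta, gamma, eps, sigma):
    """Short- and long-run gaps from the assumed loss factors, valid for R = 1."""
    C_seg = (alpha + beta) * (1 - alpha - beta) + gamma / sigma
    C_int = alpha * (1 - alpha) + beta / eps + gamma / sigma
    phi = C_seg - C_int
    Phi = C_seg / (1 - (alpha + beta) ** 2) - C_int / (1 - alpha**2)
    return phi, Phi

# School-funding mapping: eps = sigma, alpha = delta, beta = (1-delta)(sigma-1)/sigma, gamma = (1-delta)/sigma.
delta, sigma = 0.5, 2.0
alpha = delta
beta = (1 - delta) * (sigma - 1) / sigma
gamma = (1 - delta) / sigma
phi, Phi = general_phi_Phi(alpha, beta, gamma, sigma, sigma)
```

For $\delta = 0.5$ and $\sigma = 2$ both routes give a negative short-run term and a positive long-run term, and `long_run_Phi` approaches $\frac{1}{2}(1-\delta)/(1+\delta)$ as $\sigma$ grows large.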

The higher is agents' intergenerational discount factor, the larger the proportion of them who would vote for a national rather than a local system; for a high enough $\rho$, there would be unanimity. Being derived
from  a  very  simple  model,  Proposition  5  should  of  course  be  taken  with  caution.  In  particular,  one  should 
keep  in  mind  two  important  maintained  assumptions. 


19 Using (26) yields: $\log(\hat Y_1/Y_1) = (1 - \delta(1+\sigma)) \cdot (1-\delta) \cdot (\sigma-1)^2 \cdot (\Delta^2/2\sigma^2)$.

First, tax rates are kept constant across the two regimes. In practice, richer families may respond to the redistribution inherent in national funding by voting for lower taxes; on the other hand, agents are more likely to internalize economy-wide spillovers when voting on a national education budget rather than that of their school district. Richer families could also withdraw their children from the public school system, thus preserving or restoring stratification. Variations in tax rates or a switch to private education will essentially be reflected in different values of the constant factor $\Theta$. This is examined in the next section.

Second, the underlying source of inefficiency in local funding is the absence of a capital market where poor communities can borrow from richer ones to finance schools. National equalization of expenditures amounts to a partial and gradual redistribution of human capital, with some payback to the rich in the form of a higher $H_t$. Given decreasing returns to dynastic accumulation ($\alpha + \beta = 1 - (1-\delta)/\sigma$), it is intuitive that it should increase aggregate efficiency. So what is perhaps most surprising is that this is only true in the long run: early on, output and human capital accumulation may actually be reduced, as rich families lose more than poor ones gain. Also unexpected is the result that the steady-state gain from a national scheme remains finite even when dynasties face returns arbitrarily close to one: as workers become almost perfect substitutes, $\lim_{\sigma\to\infty}(\Phi) = \frac{1}{2} \cdot (1-\delta)/(1+\delta) > 0$.

5.2     Public  versus  Private  Education 

Let  us  now  consider  privately  purchased  education,  along  the  lines  of  Glomm  and  Ravikumar  (1992).  In 
the  model  of  Section  2.1,  we  simply  replace  equation  (4)  by: 

(4')  $e^i_t = \tau^i_t \cdot y^i_t$

where $\tau^i_t$ now represents the fraction of her income which adult $i$ spends on her child's education, and the corresponding input $e^i_t$ takes the place of the per capita school budget $E^i_t$ in the accumulation equation (3). We again keep the preferences side of the model as simple as possible by assuming logarithmic utility. We also assume log-normally distributed initial conditions and shocks, as in Section 4, or a small enough dispersion that the Taylor approximations of Section 3 are legitimate. With these conditions, we prove in the appendix:

Proposition 6    Under private education, the fractions $1 - \nu$ and $\tau$ of their time and income which adults devote to their children are given by:

$$\nu = 1 - \rho\delta, \qquad \tau = \frac{\rho(1-\delta)}{1-\rho\delta}\left(\frac{\sigma-1}{\sigma}\right).$$

Under national public education, adults' time allocation is unchanged. The tax rate unanimously preferred by voters is $\tau^*$ or $\tilde\tau$, depending on whether or not they internalize the complementarity of human capital levels in output $Y_t = \nu \cdot H_t$:

$$\tilde\tau = \frac{\rho(1-\delta)(\sigma-1)/\sigma}{1-\rho\delta+\rho(1-\delta)(\sigma-1)/\sigma} \;<\; \tau \;<\; \tau^* = \frac{\rho(1-\delta)}{1-\rho\delta}.$$

The reason why $\tilde\tau < \tau$ is that the private marginal value of human capital, hence also the return on savings, is higher under private than under public education. This is because in the first case, an extra unit of human capital enables the adult to not only consume more, but also to buy more education for her offspring. This is very similar to the effect identified by Glomm and Ravikumar (1992), who allow the young a choice between leisure and effort at studying, and show that they work harder when education is privately purchased. In our model, time is allocated not between studying and leisure but between production and at-home education, both of which allow adults to pass on more instruction to their offspring. The difference in marginal values of human capital therefore shows up not in different values of $\nu$, but in different preferred savings rates. More importantly, this implies that private education need not lead to higher investment in human capital. Whether the accumulation factor $\theta$ under private education is higher or lower than its counterpart $\hat\theta$ in a publicly funded system depends on whether $\hat\tau = \tilde\tau$ or $\hat\tau = \tau^*$.
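The ranking of the three rates is easy to verify numerically. The following sketch evaluates the expressions of Proposition 6, read as $\nu = 1 - \rho\delta$, $\tau = \rho(1-\delta)k/(1-\rho\delta)$, $\tilde\tau = \rho(1-\delta)k/(1-\rho\delta+\rho(1-\delta)k)$ and $\tau^* = \rho(1-\delta)/(1-\rho\delta)$ with $k = (\sigma-1)/\sigma$; the function name and the parameter values are hypothetical.

```python
import math

def preferred_rates(rho, delta, sigma):
    """Return (nu, tau_tilde, tau_private, tau_star) under the assumed reading of Proposition 6."""
    k = (sigma - 1) / sigma
    nu = 1 - rho * delta                                   # share of time spent working
    tau_private = rho * (1 - delta) * k / (1 - rho * delta)
    tau_tilde = rho * (1 - delta) * k / (1 - rho * delta + rho * (1 - delta) * k)
    tau_star = rho * (1 - delta) / (1 - rho * delta)
    return nu, tau_tilde, tau_private, tau_star

nu, tau_tilde, tau_private, tau_star = preferred_rates(rho=0.8, delta=0.5, sigma=2.0)
```

With $\rho = 0.8$, $\delta = 0.5$ and $\sigma = 2$ this gives $\tilde\tau = 0.25 < \tau = 1/3 < \tau^* = 2/3$, so the public system invests more than the private one only if voters internalize the economy-wide complementarity.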

In principle, voters should realize that a marginal increase in any agent's human capital allows an increase not only in her own consumption (as in the case leading to $\tilde\tau$), or even in that of all her dynasty (as under private education, leading to $\tau$), but in that of all dynasties. If they do, they will base their vote on the social rather than the private return to educational investment, leading to $\tau^* > \tau$. On the other hand, private underinvestment in education could also be addressed by uniformly taxing consumption and subsidizing education. This would be equivalent to raising $\tau$ without going to publicly, i.e. uniformly, funded education. For this reason, but also to highlight the effects of heterogeneity, the case $\theta > \hat\theta$ might still be considered the most likely one, as we turn to comparing growth rates. Proposition 6 shows that private education is equivalent to locally funded public education in a segregated economy ($L^i_t = h^i_t$), except for the value of $\theta$. Therefore we have, with $\phi < 0$ and $\Phi > 0$ still given by Proposition 5:

Proposition 7 Let $\theta$ and $\hat\theta$ denote the trend factors reflecting the differential incentive effects of private and national public education. Public education is less efficient in the short run if $\theta - \hat\theta > \phi \cdot \Delta^2/2$, but leads to faster growth in human capital and output in the long run if $\Phi \cdot (s^2/2) > \theta - \hat\theta$.

This  result  formalizes  some  of  the  main  arguments  in  the  debate  about  public  versus  private  education 
or  related  voucher  schemes,  at  least  where  efficiency  is  concerned.  The  key  issue  is  the  relative  importance 
of  incentive  and  stratification  effects,  and  perhaps  also  the  relevant  time  horizon.  Our  results  also  suggest 
that  in  the  long  run,  both  private  and  national  systems  dominate  locally  funded  public  education,  which 
has  neither  the  incentive  properties  of  the  former  nor  the  homogenization  properties  of  the  latter.  Of  course 
these  conclusions  are  based  on  a  simple,  stylized  model  and  should  be  taken  with  caution.  But  the  model 
is  indicative  of  the  major  forces  at  play  in  each  case. 

It is also interesting to relate our results to those of Glomm and Ravikumar (1992), whose work we have been building on. Their model leads to the reduced form $h^i_{t+1} = \Theta \cdot (h^i_t)^{\alpha+\beta}$ under private education, and $\hat h^i_{t+1} = \hat\Theta \cdot (\hat h^i_t)^{\alpha} (\hat A_t)^{\beta}$ under public education. This is another special case of our general model, with $\gamma = 0$ (no global interaction), $\varepsilon = \infty$ (perfect substitutability) and $s^2 = 0$ (no shocks). Glomm and Ravikumar observe that if there is enough inequality, the economy can temporarily experience faster growth under public education, provided $2\alpha + \beta < 1$; but they do not explain why. The analysis of the short run effects of segregation in (22) provides the answer: when $2\alpha\beta < \beta(1-\beta)$, the complementarity between parental background and school quality is dominated by the decreasing returns to quality. This makes the accumulation equation under public schooling (integration) less concave in parents' human capital than its counterpart under private schooling (segregation). Turning to the long run, the asymptotic growth rate in Glomm and Ravikumar's model reflects incentives only, as the effects of initial inequality $\Delta^2$ vanish over time (for $\alpha + \beta < 1$); it is therefore always higher under private education. Proposition 7 shows that the situation is quite different when one allows for ongoing sources of heterogeneity such as random ability or parental altruism, unpredictable obsolescence of specialized skills etc., and for some degree of complementarity in production, no matter how small.
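The short-run condition $2\alpha + \beta < 1$ can be checked directly in the log-normal case. The sketch below assumes $\log h \sim N(m, \Delta^2)$, equal trend factors in the two regimes, and the two reduced forms $h' \propto h^{\alpha+\beta}$ (private) and $h' \propto h^{\alpha} A^{\beta}$ (public) with $A = E[h]$; it uses the standard moment formula $\log E[h^p] = pm + p^2\Delta^2/2$. Function names and parameter values are hypothetical.

```python
import math

def log_mean_power(p, m, Delta2):
    """log E[h**p] when log h ~ N(m, Delta2)."""
    return p * m + p**2 * Delta2 / 2

def one_period_gap(alpha, beta, m, Delta2):
    """log(mean h' under public) - log(mean h' under private), with equal trend factors."""
    log_A = log_mean_power(1.0, m, Delta2)                 # log of average human capital
    public = log_mean_power(alpha, m, Delta2) + beta * log_A
    private = log_mean_power(alpha + beta, m, Delta2)
    return public - private

gap_low_alpha = one_period_gap(alpha=0.2, beta=0.3, m=0.0, Delta2=1.0)   # 2*alpha + beta < 1
gap_high_alpha = one_period_gap(alpha=0.5, beta=0.3, m=0.0, Delta2=1.0)  # 2*alpha + beta > 1
```

The gap equals $\beta(1 - 2\alpha - \beta)\Delta^2/2$ exactly, so public education grows faster for one period when $2\alpha + \beta < 1$ and slower otherwise, and the effect scales with the initial inequality $\Delta^2$.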

5.3     Immigration 

For  many  countries,  the  immigration  of  workers  and  their  families  with  lower  levels  of  education  than  the 
resident  population  constitutes  another  periodic  or  ongoing  source  of  heterogeneity.  The  unification  of 
East  and  West  Germany  provides  another  example.  What  the  results  of  this  paper  tend  to  show  in  this 
context  is  the  following.  The  economy's  long  run  performance  is  likely  to  be  superior,  benefiting  all  family 
lines,  if  immigrants  and  their  descendants  are  integrated  rapidly,  meaning  that  they  share  neighborhoods, 
schools and other public goods with the richer local population, than if they remain isolated in homogeneous "ghettos".20 However, integration may well have a negative impact on the first few generations of
established  residents.  Their  individual  incentives  will  always  push  society  toward  segregation.  Whether 
they  will  collectively  (through  the  political  process)  recognize  and  seize  the  long-run  benefits  of  integration 
will  depend  on  how  they  discount  the  welfare  of  future  generations. 


20  See  Burda  and  Wyplosz  (1991)  for  a  model  of  migration  flows  with  human  capital  spillovers.  One  possible  motive  for  the 
richer  country  to  accept  immigration  or  unification  may  be  the  standard  gain  from  increasing  specialization  achievable  with 
a larger population; see Tamura (1991b).

6     Conclusion

The  model  developed  in  this  paper  is  extremely  simple.  The  acquisition  of  human  capital  reflects  family, 
community  and  economy-wide  effects.  The  accumulation  equation  is  general  enough  to  encompass  the 
reduced  forms  of  most  previous  models.  The  degrees  of  complementarity  or  substitutability  in  local  and 
economy-wide interactions capture the direct costs or benefits of heterogeneity at each level.

In  spite  of  its  simplicity,  this  framework  allowed  us  to  study  several  important  issues  and  derive  many 
results.  We  examined  how  economic  stratification  affects  growth  and  welfare,  and  showed  in  particular 
that  integration  may  slow  down  growth  in  the  short  run  but  promote  it  in  the  long  run.  We  also  compared 
the  performance  of  locally  and  nationally  funded  public  education  systems,  which  exhibited  the  same 
intertemporal tradeoff. Introducing private education led to an additional tradeoff between incentive and
stratification  effects. 

The  model  could  be  extended  in  a  number  of  directions.  For  instance,  it  would  be  interesting  to  refine 
the  production  and  labor  market  side  of  the  model,  by  distinguishing  occupations  which  play  asymmetric 
roles  in  the  production  process,  such  as  managers  and  workers.  This  would  introduce  an  optimal  degree  of 
human capital inequality in the labor force, which could then be related to those generated by stratification,
integration  and  alternative  education  systems.21  The  interplay  between  local  complementarities,  clustering 
and  global  interactions  also  seems  like  a  promising  direction  for  future  research.  In  Benabou  (1991)  this 
idea  allowed  us  to  study  the  social  structure  and  productivity  of  cities.  In  this  paper  we  extended  it  to 
a  dynamic  framework  with  heterogeneous  agents,  and  to  the  study  of  education  funding.  It  should  be 
applicable  to  a  variety  of  other  problems. 


21 Recall that under symmetry, the optimal degree of inequality is either zero, when agents are complements (the more heterogeneity, the lower is $H_t$), or infinity, when they are substitutes (the more heterogeneity, the higher is $H_t$).

Appendix A:  More General Education Technologies

A.1      Complementarity Within and Between Inputs

The Cobb-Douglas specification (8) is convenient but has no special theoretical or empirical justification. Moreover, it allows stratification to have long-run effects only when $1/\sigma$ or $1/\varepsilon$ differ from one. Intuitively, what should matter are the relative values of the elasticities of substitution operating within the aggregates $L^i_t$ and $H_t$, and between the three inputs entering into $h^i_{t+1} = \Theta \cdot F(h^i_t, L^i_t, H_t)$. We show here that such is the case. Let:

(A.1)  $$h^i_{t+1} = \Theta \cdot \xi^i_t \cdot \left[\alpha'(h^i_t)^{\frac{\lambda-1}{\lambda}} + \beta'(L^i_t)^{\frac{\lambda-1}{\lambda}} + \gamma'(H_t)^{\frac{\lambda-1}{\lambda}}\right]^{\frac{R\lambda}{\lambda-1}}$$


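where the weights $\alpha'$, $\beta'$, and $\gamma'$ sum to one, $R > 0$ is any degree of returns to scale and $\lambda$ can take any sign. When $\lambda$ tends to one we obtain (8) in the limit, with $\alpha = R\alpha'$, $\beta = R\beta'$ and $\gamma = R\gamma'$. Using Taylor approximations similar to those of Section 3, one can show that the loss factors for the segregated and integrated economies are:

(A.2)  $$C \approx R\left(\frac{(\alpha'+\beta')(1-\alpha'-\beta')}{\lambda} + (\alpha'+\beta')^2(1-R) + \frac{\gamma'}{\sigma}\right)$$

(A.3)  $$\hat C \approx R\left(\frac{\alpha'(1-\alpha')}{\lambda} + \alpha'^2(1-R) + \frac{\beta'}{\varepsilon} + \frac{\gamma'}{\sigma}\right)$$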

For instance in $\hat C$, heterogeneity is costly because of: (a) concavity of a child's human capital in that of her parent, due to $\alpha' < 1$ and to the complementarity of the three inputs into her education, $1/\lambda > 0$; if $h$, $L$ and $H$ are substitutes, on the other hand, heterogeneity is beneficial; (b) decreasing total returns, which make $F(\cdot)$ more concave; $R > 1$ has the opposite effect; (c) complementarity within $L_t$; (d) complementarity within $H_t$.

In the short run, mixing raises the growth rate if $\phi > 0$, where:

(A.4)  $$\phi = C - \hat C = R\beta'\left(\frac{1 - 2\alpha' - \beta'}{\lambda} + (2\alpha'+\beta')(1-R) - \frac{1}{\varepsilon}\right)$$

The interpretation is similar to that of (20), to which (A.4) reduces when $\lambda = 1$. Equation (A.4) also shows that decreasing total returns make mixing more efficient. We now turn to the long run. To a second-order approximation, the law of motion (28) for $\Delta_t^2$ remains unchanged, and similarly for $\hat\Delta_t^2$. Thus:

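Proposition 8 The gap between the integrated and segregated economies is given by the same expressions as in Propositions 2, 3 and 4, with $C$ and $\hat C$ now given by (A.2) and (A.3), and:

(A.5)  $$\Phi = \frac{\dfrac{(\alpha+\beta)(1-\alpha-\beta)}{\lambda} + (\alpha+\beta)^2\left(\dfrac{\lambda-1}{\lambda}\right)\left(\dfrac{1-R}{R}\right) + \dfrac{\gamma}{\sigma}}{\alpha+\beta+\gamma-(\alpha+\beta)^2} \;-\; \frac{\dfrac{\alpha(1-\alpha)}{\lambda} + \alpha^2\left(\dfrac{\lambda-1}{\lambda}\right)\left(\dfrac{1-R}{R}\right) + \dfrac{\beta}{\varepsilon} + \dfrac{\gamma}{\sigma}}{\alpha+\beta+\gamma-\alpha^2}$$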


We see that $\Phi = 0$ when $\lambda = \varepsilon = \sigma$ and either $\lambda = 1$ or $R = 1$. This is the only case where the manner in which agents are partitioned has no steady-state effect. Intuitively, aggregation then causes similar losses whether it occurs at the level of a child's education (within $F(h, L, H)$), of a community (within $L$) or of the whole economy (within $H$). We can also rewrite $\Phi$ as:

(A.6)  $$\Phi = \frac{\beta}{R-\alpha^2}\left[\frac{1}{\sigma} - \frac{1}{\varepsilon} + \left(\frac{1}{\lambda}-\frac{1}{\sigma}\right)\frac{\gamma(1-\alpha) + (\alpha+\beta)(1-R)}{R-(\alpha+\beta)^2} + \left(1-\frac{1}{\lambda}\right)\frac{(1-R)(2\alpha+\beta)}{R-(\alpha+\beta)^2}\right]$$


As we saw in the case $\lambda = 1$, greater complementarity at the aggregate than at the local level ($1/\sigma > 1/\varepsilon$) tends to make integration superior in the long run. We now see that greater complementarity (i.e. a higher cost of disparity) between $(h^i_t, L^i_t, H_t)$ than within $L_t$ or $H_t$ ($1/\lambda > 1/\sigma$) has a similar effect. In particular, this is why $\Phi > 0$ when $\lambda = 1$ and $1/\varepsilon = 1/\sigma = 0$, as shown on Figure 3. Finally, non-increasing returns to scale, $R < 1$, also tend to make integration beneficial, except when $1/\lambda$ is sufficiently greater than one.
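The two ways of writing the long-run gap can be compared numerically. The sketch below assumes that $\Phi$ equals $C/(R-(\alpha+\beta)^2) - \hat C/(R-\alpha^2)$, with the CES loss factors of (A.2)-(A.3) expressed in terms of $\alpha = R\alpha'$, $\beta = R\beta'$, $\gamma = R\gamma'$, and checks it against a three-term decomposition into local-versus-global, between-versus-within and returns-to-scale effects. Function names and parameter values are hypothetical.

```python
import math

def Phi_direct(alpha, beta, gamma, lam, eps, sigma):
    """Long-run gap as a difference of scaled loss factors (assumed form)."""
    R = alpha + beta + gamma
    s = alpha + beta
    scale = (1 - 1 / lam) * (1 - R) / R
    C_seg = s * (1 - s) / lam + s**2 * scale + gamma / sigma
    C_int = alpha * (1 - alpha) / lam + alpha**2 * scale + beta / eps + gamma / sigma
    return C_seg / (R - s**2) - C_int / (R - alpha**2)

def Phi_decomposed(alpha, beta, gamma, lam, eps, sigma):
    """Same gap, split into three interpretable terms."""
    R = alpha + beta + gamma
    s = alpha + beta
    local_vs_global = 1 / sigma - 1 / eps
    between_vs_within = (1 / lam - 1 / sigma) * (gamma * (1 - alpha) + s * (1 - R)) / (R - s**2)
    returns_to_scale = (1 - 1 / lam) * (1 - R) * (2 * alpha + beta) / (R - s**2)
    return beta / (R - alpha**2) * (local_vs_global + between_vs_within + returns_to_scale)

args = (0.3, 0.4, 0.2, 2.0, 5.0, 3.0)   # alpha, beta, gamma, lambda, epsilon, sigma (so R = 0.9)
direct, decomposed = Phi_direct(*args), Phi_decomposed(*args)
```

The two expressions agree, vanish when $\lambda = \varepsilon = \sigma$ with $R = 1$, and are positive in the Cobb-Douglas case with near-perfect substitutability inside $L$ and $H$.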

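A.2     Multiple Spillovers

Denote by $\varepsilon_k$ and $\sigma_n$ respectively the elasticities of substitution of the $L^i_{k,t}$'s and $H_{n,t}$'s entering (10'), and by $\beta_k$, $\gamma_n$ their weights (partial elasticities) in $F$. Then, whether $F(\cdot)$ is Cobb-Douglas as in Section 3 or CES as above, equations (18)-(19) or (A.2)-(A.3) show that (10') is equivalent to (10) with $\beta = \sum_{k=1}^{K}\beta_k$, $\gamma = \sum_{n=1}^{N}\gamma_n$ and:

(A.7)  $$\frac{\beta}{\varepsilon} = \sum_{k=1}^{K}\frac{\beta_k}{\varepsilon_k}, \qquad \frac{\gamma}{\sigma} = \sum_{n=1}^{N}\frac{\gamma_n}{\sigma_n}.$$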

Appendix  B:  Proofs 

Proof  of  Propositions  (3)  and  (4). 

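Solving (28) to (29) leads to: $\Delta_t^2 = \Delta_\infty^2 + (\alpha+\beta)^{2t}(\Delta^2 - \Delta_\infty^2)$, $\hat\Delta_t^2 = \hat\Delta_\infty^2 + \alpha^{2t}(\Delta^2 - \hat\Delta_\infty^2)$ and: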

<-)    *(&)  -  K-f)(^)-f<^>«±f) 

m  *(&)  ■  K-<4)G=S)-£ ■(*-*)(*=?) 

hence the results as $t \to \infty$, for any $\gamma > 0$.
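The closed-form variance paths used in this proof can be checked against the underlying recursion. The sketch below assumes that the laws of motion have the form $\Delta^2_{t+1} = (\alpha+\beta)^2\Delta^2_t + s^2$ under segregation and $\hat\Delta^2_{t+1} = \alpha^2\hat\Delta^2_t + s^2$ under integration, which is what the closed forms imply; function names and parameter values are hypothetical.

```python
import math

def variance_path(Delta2_0, s2, persistence, T):
    """Iterate Delta2_{t+1} = persistence**2 * Delta2_t + s2 for T periods."""
    path = [Delta2_0]
    for _ in range(T):
        path.append(persistence**2 * path[-1] + s2)
    return path

def variance_closed_form(Delta2_0, s2, persistence, t):
    """Limit plus geometrically decaying deviation from the limit."""
    limit = s2 / (1 - persistence**2)
    return limit + persistence ** (2 * t) * (Delta2_0 - limit)

alpha, beta, s2, Delta2_0 = 0.5, 0.25, 0.1, 1.0
segregated = variance_path(Delta2_0, s2, alpha + beta, 50)
integrated = variance_path(Delta2_0, s2, alpha, 50)
```

Because $\alpha < \alpha + \beta$, inequality under integration converges faster and to a lower limit, $s^2/(1-\alpha^2)$ rather than $s^2/(1-(\alpha+\beta)^2)$.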

Proof  of  Proposition  (6). 

1.  Technology. We first fill in a few details with respect to the production sector of Section 2, so as to compute labor income for any choice of hours worked $\nu^i_t$. We drop time subscripts for simplicity. Output is produced by competitive firms with constant returns, according to the technology:

(B.3)  $$Y = \left(\int (x_s)^{\frac{\sigma-1}{\sigma}}\,ds\right)^{\frac{\sigma}{\sigma-1}}$$

where $x_s$ is the quantity of intermediate input $s$. The output sector's inverse demand curve for $s$ is then $p_s = (x_s/Y)^{-1/\sigma}$, yielding revenue $y_s = p_s \cdot x_s = (x_s)^{\frac{\sigma-1}{\sigma}} \cdot (Y)^{\frac{1}{\sigma}}$. We assume that workers must specialize in a single input; each then chooses a different one, and we can replace the index $s$ by $i \in \Omega$. An agent with human capital $h^i$ working $\nu^i < 1$ hours produces $x^i = \nu^i h^i$ units, and earns $y^i = (\nu^i h^i)^{\frac{\sigma-1}{\sigma}} \cdot (Y)^{\frac{1}{\sigma}}$. This is her net income, as production entails only the opportunity cost of time (e.g. the value of leisure, or, in our case, parenting). Labor income also takes the form $y^i = \nu^i w^i$, where $w^i = p^i h^i = dY/d\nu^i$ can
be called the hourly wage; hence (2). When $\nu^i_t = \nu$ for all $i \in \Omega$, we obtain equations (5) (given $N = 1$ agents in $\Omega$; otherwise, $Y$ should be multiplied by $N^{\frac{\sigma}{\sigma-1}}$), and (6).

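1.  Private education. To simplify the exposition, we first assume that all agents except $i$ choose invariant participation and savings rates: $\nu^j_t = \bar\nu$ and $\tau^j_t = \bar\tau$. We later show that these restrictions are not binding. The Bellman equation for agent $i$ is then:

(B.4)  $$W_t(h^i_t) = \max_{(\nu,\tau)} \left\{ \log\left((1-\tau)\,(\nu h^i_t)^{\frac{\sigma-1}{\sigma}} (Y_t)^{\frac{1}{\sigma}}\right) + \rho \cdot E\, W_{t+1}\left(\kappa \cdot \xi^i_t \cdot ((1-\nu) h^i_t)^{\delta} \left(\tau\,(\nu h^i_t)^{\frac{\sigma-1}{\sigma}} (Y_t)^{\frac{1}{\sigma}}\right)^{1-\delta}\right) \right\}$$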
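The first order conditions are:

(B.5)  $$\frac{1}{\nu^i_t}\left(\frac{\sigma-1}{\sigma}\right) = \rho\left[\frac{\delta}{1-\nu^i_t} - \frac{1-\delta}{\nu^i_t}\left(\frac{\sigma-1}{\sigma}\right)\right] \cdot E\left[h^i_{t+1} \cdot \frac{\partial W_{t+1}}{\partial h}(h^i_{t+1})\right]$$

(B.6)  $$\frac{1}{1-\tau^i_t} = \frac{\rho(1-\delta)}{\tau^i_t} \cdot E\left[h^i_{t+1} \cdot \frac{\partial W_{t+1}}{\partial h}(h^i_{t+1})\right]$$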


Since the state of the economy is characterized by the Markov process $(A_t, \Delta^2_t)$, we guess the form of the value function as: $W_t(h) = a \cdot \log(h) + b \cdot \log(A_t) - c \cdot \Delta^2_t + d$. This leads to $\nu^i_t = \nu$, $\tau^i_t = \tau$, with:

(B.7)  $$\nu = \frac{1+\rho(1-\delta)a}{1+\rho(1-\delta)a+\rho\delta a\sigma/(\sigma-1)}, \qquad \tau = \frac{\rho(1-\delta)a}{1+\rho(1-\delta)a}$$

Equilibrium then requires that $\nu = \bar\nu$ and $\tau = \bar\tau$. Replacing in (B.4) and using the recursion equations (28) and (29) for $A_{t+1}$ and $\Delta^2_{t+1}$ identifies the constants. In particular:

(B.8)  $$a = \frac{(\sigma-1)/\sigma}{1-\rho\delta-\rho(1-\delta)(\sigma-1)/\sigma}, \qquad a+b = \frac{1}{1-\rho}, \qquad c = \frac{b}{2} \cdot \frac{\rho C + (1-\rho)/\sigma}{1-\rho(\alpha+\beta)^2},$$

where $\alpha = \delta$, $\beta = (1-\delta)(\sigma-1)/\sigma$, $\gamma = (1-\delta)/\sigma$ and $C$, $\hat C$ are given by (18)-(19). Replacing $a$ in (B.7) yields the desired results.
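The substitution step can be verified numerically. The sketch below assumes the readings $a = k/(1-\rho\delta-\rho(1-\delta)k)$ with $k = (\sigma-1)/\sigma$ for (B.8), and $\nu = (1+\rho(1-\delta)a)/(1+\rho(1-\delta)a+\rho\delta a/k)$, $\tau = \rho(1-\delta)a/(1+\rho(1-\delta)a)$ for (B.7); it then compares the result with the closed forms $\nu = 1-\rho\delta$ and $\tau = \rho(1-\delta)k/(1-\rho\delta)$. Function names and parameter values are hypothetical.

```python
import math

def private_value_coefficients(rho, delta, sigma):
    """Coefficients a and b of the guessed value function (assumed reading of (B.8))."""
    k = (sigma - 1) / sigma
    a = k / (1 - rho * delta - rho * (1 - delta) * k)
    b = 1 / (1 - rho) - a
    return a, b

def private_policies(rho, delta, sigma):
    """Work share nu and education spending share tau (assumed reading of (B.7))."""
    k = (sigma - 1) / sigma
    a, _ = private_value_coefficients(rho, delta, sigma)
    invest = rho * (1 - delta) * a
    nu = (1 + invest) / (1 + invest + rho * delta * a / k)
    tau = invest / (1 + invest)
    return nu, tau

rho, delta, sigma = 0.8, 0.5, 2.0
a, b = private_value_coefficients(rho, delta, sigma)
nu, tau = private_policies(rho, delta, sigma)
```

For $\rho = 0.8$, $\delta = 0.5$ and $\sigma = 2$ this gives $a = 1.25$, $\nu = 0.6$ and $\tau = 1/3$, which coincide with the closed forms stated in Proposition 6.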

In this equilibrium, all agents choose the same constant values $\nu$ and $\tau$. This is in fact the only solution, at least in the following sense. Consider any finite horizon ($T < \infty$) version of the model, without the restrictions $\nu^j_t = \bar\nu$, $\tau^j_t = \bar\tau$. A backwards induction similar to the derivations above shows that in each period: (a) $\nu^i_t = \nu_t$ and $\tau^i_t = \tau_t$, ensuring $Y_t = \nu_t \cdot H_t$; (b) $W_t(h) = a_t \cdot \log(h) + b_t \cdot \log(A_t) - c_t \cdot \Delta^2_t + d_t$, where the $(a_t, b_t, c_t, d_t)$'s satisfy linear difference equations whose fixed points are $(a, b, c, d)$ calculated above. The finite game thus has a unique Markov perfect equilibrium, and it involves symmetric strategies. As $T \to \infty$, it tends to the equilibrium derived earlier.

2.  Public education. We start again with some useful simplifications. First, let $\nu^j_t = \bar\nu$ for all $j \ne i$, so that the Markov process $(\hat A_t, \hat\Delta^2_t)$ fully describes the state of the economy. We relax this restriction later on. Note from (17) and the analog of (28) under integration that the tax rate $\hat\tau_t$ implemented at $t$ influences $\hat A_{t+1}$ through $\hat\Theta_t = \kappa \cdot (1-\bar\nu)^{\delta} \cdot (\bar\nu\hat\tau_t)^{1-\delta}$, but does not affect $\hat\Delta^2_{t+1}$. Second, let agent $i$ choose $\tau^i_t$ as if she expected to be the decisive voter (or a dictator) not just at $t$ but in all future periods. Because this leads to the same preferred tax rate for all agents, any other voting game (e.g., agent $i$ chooses $\tau^i_t$ as if she were decisive at $t$ but expected the median voter to prevail at $t' > t$) will lead to the same outcome. With these assumptions, agent $i$'s Bellman equation and first order conditions are therefore:


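(B.9)  $$V(h^i_t, \hat A_t, \hat\Delta^2_t) = \max_{(\nu,\tau)} \left\{ \log\left((1-\tau)(\nu h^i_t)^{\frac{\sigma-1}{\sigma}}(Y_t)^{\frac{1}{\sigma}}\right) + \rho \cdot E\,V\left(\kappa \cdot \xi^i_t \cdot ((1-\nu)h^i_t)^{\delta} \cdot (\tau Y_t)^{1-\delta},\; \hat A_{t+1},\; \hat\Delta^2_{t+1}\right)\right\}$$

(B.10)  $$\frac{1}{\nu^i_t}\left(\frac{\sigma-1}{\sigma}\right) = \frac{\rho\delta}{1-\nu^i_t} \cdot E\left[h^i_{t+1} \cdot \frac{\partial V}{\partial h}(h^i_{t+1}, \hat A_{t+1}, \hat\Delta^2_{t+1})\right]$$

(B.11)  $$\frac{1}{1-\tau^{i*}_t} = \frac{\rho(1-\delta)}{\tau^{i*}_t} \cdot E\left[h^i_{t+1} \cdot \frac{\partial V}{\partial h}(h^i_{t+1}, \hat A_{t+1}, \hat\Delta^2_{t+1})\right] + \frac{\rho(1-\delta)}{\tau^{i*}_t} \cdot E\left[\hat A_{t+1} \cdot \frac{\partial V}{\partial A}(h^i_{t+1}, \hat A_{t+1}, \hat\Delta^2_{t+1})\right]$$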


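Equation (B.11), which determines $\tau^{i*}_t$, corresponds to voters who internalize the effect of taxes on $\hat A_{t+1}$. If they do not, the corresponding rate $\tilde\tau^i_t$ is obtained by dropping the last term. Again, we guess the form of the value function: $V(h, \hat A, \hat\Delta^2) = \tilde a \cdot \log(h) + \tilde b \cdot \log(\hat A) - \tilde c \cdot \hat\Delta^2 + \tilde d$. Then (B.10)-(B.11) imply $\nu^i_t = \nu$, $\tilde\tau^i_t = \tilde\tau$, $\tau^{i*}_t = \tau^*$, where:

(B.12)  $$\nu = \frac{1}{1+\rho\delta\tilde a\sigma/(\sigma-1)}$$

(B.13)  $$\tilde\tau = \frac{\rho(1-\delta)\tilde a}{1+\rho(1-\delta)\tilde a} \;<\; \tau^* = \frac{\rho(1-\delta)(\tilde a+\tilde b)}{1+\rho(1-\delta)(\tilde a+\tilde b)}$$

While $\tilde\tau$ is based on the private marginal value of human capital $\tilde a$, $\tau^*$ reflects the full social marginal value $\tilde a + \tilde b$. Replacing in (B.9) and using the recursion equations for $(\hat A_{t+1}, \hat\Delta^2_{t+1})$ leads to:

(B.14)  $$\tilde a = \frac{(\sigma-1)/\sigma}{1-\rho\delta}, \qquad \tilde a + \tilde b = \frac{1}{1-\rho}, \qquad \tilde c = \frac{\tilde b}{2} \cdot \frac{\rho\hat C + (1-\rho)/\sigma}{1-\rho\alpha^2}$$

Replacing in (B.12) and (B.13) leads to the desired results. Note that $\tilde a < a < \tilde a + \tilde b$, implying that $\tilde\tau < \tau < \tau^*$.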

To exclude other equilibria, consider again the finite horizon game, without the restriction $\nu^j_t = \bar\nu$. An agent's value function and (Markov) strategy depend on $(h^i_t, \mu_t)$, where $\mu_t$ is the distribution of human capital. Whether voters internalize the effect of $\tau^i_t$ on $\mu_{t+1}$ or not, backwards induction easily shows that in each period: (a) $\nu^i_t = \nu_t$, ensuring in particular that $\mu_t$ remains log-normal and that $Y_t = \nu_t H_t$; (b) $\tau^i_t = \hat\tau_t$, i.e. there is unanimity over the sequence of tax rates; (c) $V_t(h, \mu_t) = a_t \cdot \log(h) + b_t \cdot \log(A_t) - c_t \cdot \Delta^2_t + d_t$, where the $(a_t, b_t, c_t, d_t)$'s satisfy linear difference equations whose fixed points are $(a, b, c, d)$ calculated above. Letting the horizon tend to infinity concludes the proof.